Table 1 



BCA4 DNA sequence (SEQ ID NO:1)

Gene name: osteoblast specific factor 2 (periostin); Unigene number: Hs.136348; Probeset
Accession #: D13666; Nucleic Acid Accession #: NM_006475; Coding sequence: 12-2522 (start
and stop codons underlined)

AGAGACTCAA GATGATTCCC TTTTTACCCA TGTTTTCTCT ACTATTGCTG CTTATTGTTA 60

ACCCTATAAA CGCCAACAAT CATTATGACA AGATCTTGGC TCATAGTCGT ATCAGGGGTC 12 0 

GGGACCAAGG CCCAAATGTC TGTGCCCTTC AACAGATTTT GGGCACCAAA AAGAAATACT 180 

TCAGCACTTG TAAGAACTGG TATAAAAAGT CCATCTGTGG ACAGAAAACG ACTGTTTTAT 24 0 

ATGAATGTTG CCCTGGTTAT ATGAGAATGG AAGGAATGAA AGGCTGCCCA GCAGTTTTGC 300

CCATTGACCA TGTTTATGGC ACTCTGGGCA TCGTGGGAGC CACCACAACG CAGCGCTATT 360

CTGACGCCTC AAAACTGAGG GAGGAGATCG AGGGAAAGGG ATCCTTCACT TACTTTGCAC 420

CGAGTAATGA GGCTTGGGAC AACTTGGATT CTGATATCCG TAGAGGTTTG GAGAGCAACG 480

TGAATGTTGA ATTACTGAAT GCTTTACATA GTCACATGAT TAATAAGAGA ATGTTGACCA 54 0 

AGGACTTAAA AAATGGCATG ATTATTCCTT CAATGTATAA CAATTTGGGG CTTTTCATTA 600 

ACCATTATCC TAATGGGGTT GTCACTGTTA ATTGTGCTCG AATCATCCAT GGGAACCAGA 660

TTGCAACAAA TGGTGTTGTC CATGTCATTG ACCGTGTGCT TACACAAATT GGTACCTCAA 720 

TTCAAGACTT CATTGAAGCA GAAGATGACC TTTCATCTTT TAGAGCAGCT GCCATCACAT 780 

CGGACATATT GGAGGCCCTT GGAAGAGACG GTCACTTCAC ACTCTTTGCT CCCACCAATG 84 0 

AGGCTTTTGA GAAACTTCCA CGAGGTGTCC TAGAAAGGTT CATGGGAGAC AAAGTGGCTT 900 

CCGAAGCTCT TATGAAGTAC CACATCTTAA ATACTCTCCA GTGTTCTGAG TCTATTATGG 96 0 

GAGGAGCAGT CTTTGAGACG CTGGAAGGAA ATACAATTGA GATAGGATGT GACGGTGACA 10 2 0 

GTATAACAGT AAATGGAATC AAAATGGTGA ACAAAAAGGA TATTGTGACA AATAATGGTG 10 8 0 

TGATCCATTT GATTGATCAG GTCCTAATTC CTGATTCTGC CAAACAAGTT ATTGAGCTGG 114 0 

CTGGAAAACA GCAAACCACC TTCACGGATC TTGTGGCCCA ATTAGGCTTG GCATCTGCTC 1200

TGAGGCCAGA TGGAGAATAC ACTTTGCTGG CACCTGTGAA TAATGCATTT TCTGATGATA 126 0 

CTCTCAGCAT GGTTCAGCGC CTCCTTAAAT TAATTCTGCA GAATCACATA TTGAAAGTAA 13 2 0 

AAGTTGGCCT TAATGAGCTT TACAACGGGC AAATACTGGA AACCATCGGA GGCAAACAGC 13 8 0 

TCAGAGTCTT CGTATATCGT ACAGCTGTCT GCATTGAAAA TTCATGCATG GAGAAAGGGA 144 0 

GTAAGCAAGG GAGAAACGGT GCGATTCACA TATTCCGCGA GATCATCAAG CCAGCAGAGA 1500

AATCCCTCCA TGAAAAGTTA AAACAAGATA AGCGCTTTAG CACCTTCCTC AGCCTACTTG 156 0 

AAGCTGCAGA CTTGAAAGAG CTCCTGACAC AACCTGGAGA CTGGACATTA TTTGTGCCAA 162 0 

CCAATGATGC TTTTAAGGGA ATGACTAGTG AAGAAAAAGA AATTCTGATA CGGGACAAAA 16 80 

ATGCTCTTCA AAACATCATT CTTTATCACC TGACACCAGG AGTTTTCATT GGAAAAGGAT 174 0 

TTGAACCTGG TGTTACTAAC ATTTTAAAGA CCACACAAGG AAGCAAAATC TTTCTGAAAG 18 00 

AAGTAAATGA TACACTTCTG GTGAATGAAT TGAAATCAAA AGAATCTGAC ATCATGACAA 186 0 

CAAATGGTGT AATTCATGTT GTAGATAAAC TCCTCTATCC AGCAGACACA CCTGTTGGAA 1920

ATGATCAACT GCTGGAAATA CTTAATAAAT TAATCAAATA CATCCAAATT AAGTTTGTTC 19 8 0 

GTGGTAGCAC CTTCAAAGAA ATCCCCGTGA CTGTCTATAC AACTAAAATT ATAACCAAAG 2 04 0 

TTGTGGAACC AAAAATTAAA GTGATTGAAG GCAGTCTTCA GCCTATTATC AAAACTGAAG 210 0 

GACCCACACT AACAAAAGTC AAAATTGAAG GTGAACCTGA ATTCAGACTG ATTAAAGAAG 216 0 

GTGAAACAAT AACTGAAGTG ATCCATGGAG AGCCAATTAT TAAAAAATAC ACCAAAATCA 222 0 

TTGATGGAGT GCCTGTGGAA ATAACTGAAA AAGAGACACG AGAAGAACGA ATCATTACAG 22 8 0 

GTCCTGAAAT AAAATACACT AGGATTTCTA CTGGAGGTGG AGAAACAGAA GAAACTCTGA 2 34 0 

AGAAATTGTT ACAAGAAGAG GTCACCAAGG TCACCAAATT CATTGAAGGT GGTGATGGTC 24 0 0 

ATTTATTTGA AGATGAAGAA ATTAAAAGAC TGCTTCAGGG AGACACACCC GTGAGGAAGT 24 6 0 

TGCAAGCCAA CAAAAAAGTT CAAGGTTCTA GAAGACGATT AAGGGAAGGT CGTTCTCAGT 252 0 

GAAAATCCAA AAACCAGAAA AAAATGTTTA TACAACCCTA AGTCAATAAC CTGACCTTAG 258 0 

AAAATTGTGA GAGCCAAGTT GACTTCAGGA ACTGAAACAT CAGCACAAAG AAGCAATCAT 264 0 

CAAATAATTC TGAACACAAA TTTAATATTT TTTTTTCTGA ATGAGAAACA TGAGGGAAAT 2 70 0 

TGTGGAGTTA GCCTCCTGTG GTAAAGGAAT TGAAGAAAAT ATAACACCTT ACACCCTTTT 276 0 

TCATCTTGAC ATTAAAAGTT CTGGCTAACT TTGGAATCCA TTAGAGAAAA ATCCTTGTCA 2820

CCAGATTCAT TACAATTCAA ATCGAAGAGT TGTGAACTGT TATCCCATTG AAAAGACCGA 2880

GCCTTGTATG TATGTTATGG ATACATAAAA TGCACGCAAG CCATTATCTC TCCATGGGAA 2 94 0 

GCTAAGTTAT AAAAATAGGT GCTTGGTGTA CAAAACTTTT TATATCAAAA GGCTTTGCAC 3 000 

ATTTCTATAT GAGTGGGTTT ACTGGTAAAT TATGTTATTT TTTACAACTA ATTTTGTACT 306 0 

CTCAGAATGT TTGTCATATG CTTCTTGCAA TGCATATTTT TTAATCTCAA ACGTTTCAAT 312 0 

AAAACCATTT TTCAGATATA AAGAGAATTA CTTCAAATTG AGTAATTCAG AAAAACTCAA 3180
GATTTAAGTT AAAAAGTGGT TTGGACTTGG GAA 
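
The coding-sequence coordinates given in each header can be sanity-checked against the length of the corresponding protein sequence. The following is a minimal sketch and not part of the original listing. It assumes the coordinates are 1-based, inclusive, and include the stop codon, which is consistent with SEQ ID NO:1 above (ATG at positions 12-14, TGA at positions 2520-2522). The helper names `clean_row` and `cds_protein_length` are hypothetical and introduced only for illustration.

```python
def clean_row(row: str) -> str:
    """Drop grouping spaces and the trailing position counter from one listing row."""
    return "".join(ch for ch in row if ch.isalpha()).upper()


def cds_protein_length(start: int, end: int) -> int:
    """Number of residues implied by CDS coordinates.

    Assumption: `start` and `end` are 1-based, inclusive, and `end` is the
    last base of the stop codon, so one codon is subtracted for the stop.
    """
    n_nt = end - start + 1
    if n_nt % 3 != 0:
        raise ValueError(f"CDS length {n_nt} is not a multiple of 3")
    return n_nt // 3 - 1


if __name__ == "__main__":
    first_row = clean_row("AGAGACTCAA G ATGA TTCCC 60")
    print(first_row[11:14])               # bases 12-14 of SEQ ID NO:1: ATG
    print(cds_protein_length(12, 2522))   # 836
```

For the 12-2522 coordinates of SEQ ID NO:1 this gives 836 residues, and for a header reading 85-1347 it gives 420; a mismatch with the listed protein length would point to a transcription error in either the coordinates or the sequence rows.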



BCA4 Protein sequence (SEQ ID NO: 2) 

Gene name: osteoblast specific factor 2 (periostin); Unigene number: Hs.136348; Probeset
Accession #: D13666; Protein Accession #: NP_006466; Predicted Signal sequence: 1-21; TM
domains: none; PFAM domains: fasciclin_domains: 94-232, 234-367, 496-630; Summary: a
secreted protein involved in adhesion and osteoblast development; may participate in
preferential metastasis of breast cancer to bone.

MIPFLPMFSL LLLLIVNPIN ANNHYDKILA HSRIRGRDQG PNVCALQQIL GTKKKYFSTC 60 
KNWYKKSICG QKTTVLYECC PGYMRMEGMK GCPAVLPIDH VYGTLGIVGA TTTQRYSDAS 120
KLREEIEGKG SFTYFAPSNE AWDNLDSDIR RGLESNVNVE LLNALHSHMI NKRMLTKDLK 180
NGMIIPSMYN NLGLFINHYP NGVVTVNCAR IIHGNQIATN GVVHVIDRVL TQIGTSIQDF 240
IEAEDDLSSF RAAAITSDIL EALGRDGHFT LFAPTNEAFE KLPRGVLERF MGDKVASEAL 300
MKYHILNTLQ CSESIMGGAV FETLEGNTIE IGCDGDSITV NGIKMVNKKD IVTNNGVIHL 360
IDQVLIPDSA KQVIELAGKQ QTTFTDLVAQ LGLASALRPD GEYTLLAPVN NAFSDDTLSM 420
VQRLLKLILQ NHILKVKVGL NELYNGQILE TIGGKQLRVF VYRTAVCIEN SCMEKGSKQG 480

RNGAIHIFRE IIKPAEKSLH EKLKQDKRFS TFLSLLEAAD LKELLTQPGD WTLFVPTNDA 540

FKGMTSEEKE ILIRDKNALQ NIILYHLTPG VFIGKGFEPG VTNILKTTQG SKIFLKEVND 600 

TLLVNELKSK ESDIMTTNGV IHVVDKLLYP ADTPVGNDQL LEILNKLIKY IQIKFVRGST 660

FKEIPVTVYT TKIITKVVEP KIKVIEGSLQ PIIKTEGPTL TKVKIEGEPE FRLIKEGETI 720

TEVIHGEPII KKYTKIIDGV PVEITEKETR EERIITGPEI KYTRISTGGG ETEETLKKLL 780
QEEVTKVTKF IEGGDGHLFE DEEIKRLLQG DTPVRKLQAN KKVQGSRRRL REGRSQ 

BCA7 DNA sequence (SEQ ID NO:3)

Gene name: 5T4 oncofetal trophoblast glycoprotein; Unigene number: Hs.82128; Probeset
Accession #: Z29083; Nucleic Acid Accession #: NM_006670; Coding sequence: 85-1347 (start
and stop codons underlined)

CCGGCTCGCG CCCTCCGGGC CCAGCCTCCC GAGCCTTCGG AGCGGGCGCC GTCCCAGCCC 60 

AGCTCCGGGG AAACGCGAGC CGCGATGCCT GGGGGGTGCT CCCGGGGCCC CGCCGCCGGG 12 0 

GACGGGCGTC TGCGGCTGGC GCGACTAGCG CTGGTACTCC TGGGCTGGGT CTCCTCGTCT 180 

TCTCCCACCT CCTCGGCATC CTCCTTCTCC TCCTCGGCGC CGTTCCTGGC TTCCGCCGTG 24 0 

TCCGCCCAGC CCCCGCTGCC GGACCAGTGC CCCGCGCTGT GCGAGTGCTC CGAGGCAGCG 3 00 

CGCACAGTCA AGTGCGTTAA CCGCAATCTG ACCGAGGTGC CCACGGACCT GCCCGCCTAC 360

GTGCGCAACC TCTTCCTTAC CGGCAACCAG CTGGCCGTGC TCCCTGCCGG CGCCTTCGCC 4 20 

CGCCGGCCGC CGCTGGCGGA GCTGGCCGCG CTCAACCTCA GCGGCAGCCG CCTGGACGAG 480

GTGCGCGCGG GCGCCTTCGA GCATCTGCCC AGCCTGCGCC AGCTCGACCT CAGCCACAAC 54 0 

CCACTGGCCG ACCTCAGTCC CTTCGCTTTC TCGGGCAGCA ATGCCAGCGT CTCGGCCCCC 6 00 

AGTCCCCTTG TGGAACTGAT CCTGAACCAC ATCGTGCCCC CTGAAGATGA GCGGCAGAAC 6 60 

CGGAGCTTCG AGGGCATGGT GGTGGCGGCC CTGCTGGCGG GCCGTGCACT GCAGGGGCTC 72 0 

CGCCGCTTGG AGCTGGCCAG CAACCACTTC CTTTACCTGC CGCGGGATGT GCTGGCCCAA 780

CTGCCCAGCC TCAGGCACCT GGACTTAAGT AATAATTCGC TGGTGAGCCT GACCTACGTG 84 0 

TCCTTCCGCA ACCTGACACA TCTAGAAAGC CTCCACCTGG AGGACAATGC CCTCAAGGTC 90 0 

CTTCACAATG GCACCCTGGC TGAGTTGCAA GGTCTACCCC ACATTAGGGT TTTCCTGGAC 96 0 

AACAATCCCT GGGTCTGCGA CTGCCACATG GCAGACATGG TGACCTGGCT CAAGGAAACA 1020

GAGGTAGTGC AGGGCAAAGA CCGGCTCACC TGTGCATATC CGGAAAAAAT GAGGAATCGG 1080 

GTCCTCTTGG AACTCAACAG TGCTGACCTG GACTGTGACC CGATTCTTCC CCCATCCCTG 114 0 

CAAACCTCTT ATGTCTTCCT GGGTATTGTT TTAGCCCTGA TAGGCGCTAT TTTCCTCCTG 12 0 0 

GTTTTGTATT TGAACCGCAA GGGGATAAAA AAGTGGATGC ATAACATCAG AGATGCCTGC 126 0 

AGGGATCACA TGGAAGGGTA TCATTACAGA TATGAAATCA ATGCGGACCC CAGATTAACA 13 2 0 

AACCTCAGTT CTAACTCGGA TGTCTGAGAA ATATTAGAGG ACAGACCAAG GACAACTCTG 13 80 

CATGAGATGT AGACTTAAGC TTTATCCCTA CTAGGCTTGC TCCACTTTCA TCCTCCACTA 1440

TAGATACAAC GGACTTTGAC TAAAAGCAGT GAAGGGGATT TGCTTCCTTG TTATGTAAAG 1500 

TTTCTCGGTG TGTTCTGTTA ATGTAAGACG ATGAACAGTT GTGTATAGTG TTTTACCCTC 156 0 

TTCTTTTTCT TGGAACTCCT CAACACGTAT GGAGGGATTT TTCAGGTTTC AGCATGAACA 1620 

TGGGCTTCTT GCTGTCTGTC TCTCTCTCAG TACAGTTCAA GGTGTAGCAA GTGTACCCAC 1680 

ACAGATAGCA TTCAACAAAA GCTGCCTCAA CTTTTTCGAG AAAAATACTT TATTCATAAA 1740

TATCAGTTTT ATTCTCATGT ACCTAAGTTG TGGAGAAAAT AATTGCATCC TATAAACTGC 1800

CTGCAGACGT TAGCAGGCTC TTCAAAATAA CTCCATGGTG CACAGGAGCA CCTGCATCCA 1860

AGAGCATGCT TACATTTTAC TGTTCTGCAT ATTACAAAAA ATAACTTGCA ACTTCATAAC 1920 

TTCTTTGACA AAGTAAATTA CTTTTTTGAT TGCAGTTTAT ATGAAAATGT ACTGATTTTT 1980 

TTTTAATAAA CTGCATCGAG ATCCAACCGA CTGAATTGTT AAAAAAAAAA AAAAATAAAG 2 04 0 
ATTCTTAAAA GAA 

BCA7 Protein sequence (SEQ ID NO: 4)
Gene name: 5T4 oncofetal trophoblast glycoprotein; Unigene number: Hs.82128; Probeset
Accession #: Z29083; Protein Accession #: NP_006661; Predicted Signal sequence: 1-32;
Predicted TM domains: 357-373; PFAM domains: leucine_rich_repeats: 61-90, 119-142, 143-166,

Summary: Type Ia protein of unknown function, detected in multiple cancers, with highest
expression in breast cancer.

MPGGCSRGPA AGDGRLRLAR LALVLLGWVS SSSPTSSASS FSSSAPFLAS AVSAQPPLPD 60 

QCPALCECSE AARTVKCVNR NLTEVPTDLP AYVRNLFLTG NQLAVLPAGA FARRPPLAEL 120

AALNLSGSRL DEVRAGAFEH LPSLRQLDLS HNPLADLSPF AFSGSNASVS APSPLVELIL 180 

NHIVPPEDER QNRSFEGMVV AALLAGRALQ GLRRLELASN HFLYLPRDVL AQLPSLRHLD 240

LSNNSLVSLT YVSFRNLTHL ESLHLEDNAL KVLHNGTLAE LQGLPHIRVF LDNNPWVCDC 300

HMADMVTWLK ETEVVQGKDR LTCAYPEKMR NRVLLELNSA DLDCDPILPP SLQTSYVFLG 360
IVLALIGAIF LLVLYLNRKG IKKWMHNIRD ACRDHMEGYH YRYEINADPR LTNLSSNSDV

BCX5 DNA sequence (SEQ ID NO: 5)

Gene name: LNIR; Unigene number: Hs.61460; Probeset Accession #: AA028028; Nucleic Acid
Accession #: AF160477; Coding sequence: 225-1757 (start and stop codons underlined)

GGGGAGCTCG GAGCTCCCGA TCACGGCTTC TTGGGGGTAG CTACGGCTGG GTGTGTAGAA 6 0 

CGGGGCCGGG GCTGGGGCTG GGTCCCCTAG TGAGACCCAA GTGCGAGAGG CAAGAACTCT 12 0 

GCAGCTTCCT GCCTTCTGGG TCAGTTCCTT ATTCAAGTCT GCAGCCGGCT CCCAGGGAGA 180

TCTCGGTGGA ACTTCAGAAA CGCTGGGCAG TCTGCCTTTC AACCATGCCC CTGTCCCTGG 240

GAGCCGAGAT GTGGGGGCCT GAGGCCTGGC TGCTGCTGCT GCTACTGCTG GCATCATTTA 300

CAGGCCGGTG CCCCGCGGGT GAGCTGGAGA CCTCAGACGT GGTAACTGTG GTGCTGGGCC 360

AGGACGCAAA ACTGCCCTGC TTCTACCGAG GGGACTCCGG CGAGCAAGTG GGGCAAGTGG 420
CATGGGCTCG GGTGGACGCG GGCGAAGGCG CCCAGGAACT AGCGCTACTG CACTCCAAAT 480

ACGGGCTTCA TGTGAGCCCG GCTTACGAGG GCCGCGTGGA GCAGCCGCCG CCCCCACGCA 54 0 

ACCCCCTGGA CGGCTCAGTG CTCCTGCGCA ACGCAGTGCA GGCGGATGAG GGCGAGTACG 6 00 

AGTGCCGGGT CAGCACCTTC CCCGCCGGCA GCTTCCAGGC GCGGCTGCGG CTCCGAGTGA 66 0 

TGGTGCCTCC CCTGCCCTCA CTGAATCCTG GTCCAGCACT AGAAGAGGGC CAGGGCCTGA 72 0 

CCCTGGCAGC CTCCTGCACA GCTGAGGGCA GCCCAGCCCC CAGCGTGACC TGGGACACGG 7 80 

AGGTCAAAGG CACAACGTCC AGCCGTTCCT TCAAGCACTC CCGCTCTGCT GCCGTCACCT 84 0 

CAGAGTTCCA CTTGGTGCCT AGCCGCAGCA TGAATGGGCA GCCACTGACT TGTGTGGTGT 900

CCCATCCTGG CCTGCTCCAG GACCAAAGGA TCACCCACAT CCTCCACGTG TCCTTCCTTG 96 0 

CTGAGGCCTC TGTGAGGGGC CTTGAAGACC AAAATCTGTG GCACATTGGC AGAGAAGGAG 102 0 

CTATGCTCAA GTGCCTGAGT GAAGGGCAGC CCCCTCCCTC ATACAACTGG ACACGGCTGG 1080

ATGGGCCTCT GCCCAGTGGG GTACGAGTGG ATGGGGACAC TTTGGGCTTT CCCCCACTGA 1140

CCACTGAGCA CAGCGGCATC TACGTCTGCC ATGTCAGCAA TGAGTTCTCC TCAAGGGATT 12 00 

CTCAGGTCAC TGTGGATGTT CTTGACCCCC AGGAAGACTC TGGGAAGCAG GTGGACCTAG 12 6 0 

TGTCAGCCTC GGTGGTGGTG GTGGGTGTGA TCGCCGCACT CTTGTTCTGC CTTCTGGTGG 13 2 0 

TGGTGGTGGT GCTCATGTCC CGATACCATC GGCGCAAGGC CCAGCAGATG ACCCAGAAAT 13 80 

ATGAGGAGGA GCTGACCCTG ACCAGGGAGA ACTCCATCCG GAGGCTGCAT TCCCATCACA 144 0 

CGGACCCCAG GAGCCAGCCG GAGGAGAGTG TAGGGCTGAG AGCCGAGGGC CACCCTGATA 15 00 

GTCTCAAGGA CAACAGTAGC TGCTCTGTGA TGAGTGAAGA GCCCGAGGGC CGCAGTTACT 156 0 

CCACGCTGAC CACGGTGAGG GAGATAGAAA CACAGACTGA ACTGCTGTCT CCAGGCTCTG 1620

GGCGGGCCGA GGAGGAGGAA GATCAGGATG AAGGCATCAA ACAGGCCATG AACCATTTTG 16 80 

TTCAGGAGAA TGGGACCCTA CGGGCCAAGC CCACGGGCAA TGGCATCTAC ATCAATGGGC 174 0 

GGGGACACCT GGTCTGACCC AGGCCTGCCT CCCTTCCCTA GGCCTGGCTC CTTCTGTTGA 1800 

CATGGGAGAT TTTAGCTCAT CTTGGGGGCC TCCTTAAACA CCCCCATTTC TTGCGGAAGA 186 0 

TGCTCCCCAT GCCACTGACT GCTTGACCTT TACCTCCAAC CCTTCTGTTC ATCGGGAGGG 1920 

CTCCACCAAT TGAGTCTCTC CCACCATGCA TGCAGGTCAC TGTGTGTGTG CATGTGTGCC 198 0 

TGTGTGAGTG TTGACTGACT GTGTGTGTGT GGAGGGGTGA CTGTCCGTGG AGGGGTGACT 2 04 0 

GTGTCCGTGG TGTGTATTAT GCTGTCATAT CAGAGTCAAG TGAACTGTGG TGTATGTGCC 2100

ACGGGATTTG AGTGGTTGCG TGGGCAACAC TGTCAGGGTT TGGCGTGTGT GTCATGTGGC 216 0 

TGTGTGTGAC CTCTGCCTGA AAAAGCAGGT ATTTTCTCAG ACCCCAGAGC AGTATTAATG 2220

ATGCAGAGGT TGGAGGAGAG AGGTGGAGAC TGTGGCTCAG ACCCAGGTGT GCGGGCATAG 22 8 0 

CTGGAGCTGG AATCTGCCTC CGGTGTGAGG GAACCTGTCT CCTACCACTT CGGAGCCATG 234 0 

GGGGCAAGTG TGAAGCAGCC AGTCCCTGGG TCAGCCAGAG GCTTGAACTG TTACAGAAGC 24 00 

CCTCTGCCCT CTGGTGGCCT CTGGGCCTGC TGCATGTACA TATTTTCTGT AAATATACAT 24 60 

GCGCCGGGAG CTTCTTGCAG GAATACTGCT CCGAATCACT TTTAATTTTT TTCTTTTTTT 252 0 

TTTCTTGCCC TTTCCATTAG TTGTATTTTT TATTTATTTT TATTTTTATT TTTTTTTAGA 2580 

GATGGAGTCT CACTATGTTG CTCAGGCTGG CCTTGAACTC CTGGGCTCAA GCAATCCTCC 264 0 

TGCCTCAGCC TCCCTAGTAG CTGGGACTTT AAGTGTACAC CACTGTGCCT GCTTTGAATC 2700

CTTTACGAAG AGAAAAAAAA AATTAAAGAA AGCCTTTAGA TTTATCCAAT GTTTACTACT 276 0 

GGGATTGCTT AAAGTGAGGC CCCTCCAACA CCAGGGGGTT AATTCCTGTG ATTGTGAAAG 2 820 

GGGCTACTTC CAAGGCATCT TCATGCAGGC AGCCCCTTGG GAGGGCACCT GAGAGCTGGT 2 88 0 

AGAGTCTGAA ATTAGGGATG TGAGCCTCGT GGTTACTGAG TAAGGTAAAA TTGCATCCAC 2 94 0 

CATTGTTTGT GATACCTTAG GGAATTGCTT GGACCTGGTG ACAAGGGCTC CTGTTCAATA 3 000 

GTGGTGTTGG GGAGAGAGAG AGCAGTGATT ATAGACCGAG AGAGTAGGAG TTGAGGTGAG 3 06 0 

GTGAAGGAGG TGCTGGGGGT GAGAATGTCG CCTTTCCCCC TGGGTTTTGG ATCACTAATT 312 0 

CAAGGCTCTT CTGGATGTTT CTCTGGGTTG GGGCTGGAGT TCAATGAGGT TTATTTTTAG 3180 

CTGGCCCACC CAGATACACT GAGCCAGAAT ACCTAGATTT AGTACCCAAA CTCTTCTTAG 3240

TCTGAAATCT GCTGGATTTC TGGCCTAAGG GAGAGGCTCC CATCCTTCGT TCCCCAGCCA 3 3 00 

GCCTAGGACT TCGAATGTGG AGCCTGAAGA TCTAAGATCC TAACATGTAC ATTTTATGTA 336 0 

AATATGTGCA TATTTGTACA TAAAATGATA TTCTGTTTTT AAATAAACAG ACAAAACTTG 34 20 

TTCAAAAAAA AAAAAAAAAA AAAAAAAAA 

BCX5 Protein sequence (SEQ ID NO: 6) 

Gene name: LNIR; Unigene number: Hs.61460; Probeset Accession #: AA028028; Protein
Accession #: AF160477; Predicted Signal sequence: 1-26; Predicted TM domains: 355-371; PFAM
domains: IgSF_domain: 45-129, 162-225, 263-317; Summary: A type Ia TM protein; a member
of the immunoglobulin superfamily.

MPLSLGAEMW GPEAWLLLLL LLASFTGRCP AGELETSDVV TVVLGQDAKL PCFYRGDSGE 60

QVGQVAWARV DAGEGAQELA LLHSKYGLHV SPAYEGRVEQ PPPPRNPLDG SVLLRNAVQA 120 

DEGEYECRVS TFPAGSFQAR LRLRVMVPPL PSLNPGPALE EGQGLTLAAS CTAEGSPAPS 180 

VTWDTEVKGT TSSRSFKHSR SAAVTSEFHL VPSRSMNGQP LTCVVSHPGL LQDQRITHIL 240

HVSFLAEASV RGLEDQNLWH IGREGAMLKC LSEGQPPPSY NWTRLDGPLP SGVRVDGDTL 300

GFPPLTTEHS GIYVCHVSNE FSSRDSQVTV DVLDPQEDSG KQVDLVSASV VVVGVIAALL 360

FCLLVVVVVL MSRYHRRKAQ QMTQKYEEEL TLTRENSIRR LHSHHTDPRS QPEESVGLRA 420

EGHPDSLKDN SSCSVMSEEP EGRSYSTLTT VREIETQTEL LSPGSGRAEE EEDQDEGIKQ 480
AMNHFVQENG TLRAKPTGNG IYINGRGHLV

mouse BCX5 Protein sequence (SEQ ID NO: 7) 

Gene name: mouse_LNIR; Unigene number: n/a; Probeset Accession #: BF168327; Protein
Accession #: n/a; Predicted Signal sequence: 1-27; Predicted TM domains: 346-362; PFAM
domains: IgSF_domains: 44-126, 166-221, 259-313; Summary: This is the mouse orthologue of human
BCX5; it is a type Ia TM protein of unknown function.

MPLSLGAEMW GPEAWLRLLF LASFTGQYSA GELETSDVVT VVLGQDAKLP CFYRGDPDEQ 60
VGQVAWARVD PNEXYPGAGL LHSKYGLHVN PAYEDRVEQX XHETFRRSVL LRNAVQADEG 120
EYECRVSTFP SGSFQARMRL RVLVPPLPSL NPGPPLEEGQ ADVAASCTAE GSPAPSVTWD 180

TEVKGTQSSR SFTHPRSAAV TSEFHLVPSR SMNGQPLTCV VSHPGLLQDR RITHTLQVAF 240

LAEASVRGLE DQNLWQVGRE GATLKCLSEG QPPPKYNWTR LDGPLPSGVR VKGDTLGFPP 300

LTTEHSGVYX CHVSNELSSR DSQVTVEVLD PEDPGKQVDL VSASVIIVGV IAALLFCLLV 360

VVVVLMSRYH RRKAQQMTQK YEEELTLTRE NSIRRLHSHH SDPRSQPEES VGLRAEGHPD 420

SLKDNSSCSV MSEEPEGRSY STLTTVREIE TQTELLSPGS GRTEEDDDQD EGIKQAMNHL 480
CRKMGP 

BCZ6 DNA sequence (SEQ ID NO: 8) 

Gene name: IL-6 receptor beta chain (gp130; oncostatin M receptor); Unigene number:
Hs.82065; Probeset Accession #: M57230 / AA406546; Nucleic Acid Accession #: NM_002184;
Coding sequence: 256-3012 (start and stop codons underlined)

GAGCAGCCAA AAGGCCCGCG GAGTCGCGCT GGGCCGCCCC GGCGCAGCTG AACCGGGGGC 60

CGCGCCTGCC AGGCCGACGG GTCTGGCCCA GCCTGGCGCC AAGGGGTTCG TGCGCTGTGG 12 0 

AGACGCGGAG GGTCGAGGCG GCGCGGCCTG AGTGAAACCC AATGGAAAAA GCATGACATT 180 

TAGAAGTAGA AGACTTAGCT TCAAATCCCT ACTCCTTCAC TTACTAATTT TGTGATTTGG 24 0 

AAATATCCGC GCAAGATGTT GACGTTGCAG ACTTGGGTAG TGCAAGCCTT GTTTATTTTC 3 00 

CTCACCACTG AATCTACAGG TGAACTTCTA GATCCATGTG GTTATATCAG TCCTGAATCT 36 0 

CCAGTTGTAC AACTTCATTC TAATTTCACT GCAGTTTGTG TGCTAAAGGA AAAATGTATG 42 0 

GATTATTTTC ATGTAAATGC TAATTACATT GTCTGGAAAA CAAACCATTT TACTATTCCT 480

AAGGAGCAAT ATACTATCAT AAACAGAACA GCATCCAGTG TCACCTTTAC AGATATAGCT 54 0 

TCATTAAATA TTCAGCTCAC TTGCAACATT CTTACATTCG GACAGCTTGA ACAGAATGTT 600 

TATGGAATCA CAATAATTTC AGGCTTGCCT CCAGAAAAAC CTAAAAATTT GAGTTGCATT 66 0 

GTGAACGAGG GGAAGAAAAT GAGGTGTGAG TGGGATGGTG GAAGGGAAAC ACACTTGGAG 720 

ACAAACTTCA CTTTAAAATC TGAATGGGCA ACACACAAGT TTGCTGATTG CAAAGCAAAA 7 80 

CGTGACACCC CCACCTCATG CACTGTTGAT TATTCTACTG TGTATTTTGT CAACATTGAA 840

GTCTGGGTAG AAGCAGAGAA TGCCCTTGGG AAGGTTACAT CAGATCATAT CAATTTTGAT 90 0 

CCTGTATATA AAGTGAAGCC CAATCCGCCA CATAATTTAT CAGTGATCAA CTCAGAGGAA 96 0 

CTGTCTAGTA TCTTAAAATT GACATGGACC AACCCAAGTA TTAAGAGTGT TATAATACTA 1020 

AAATATAACA TTCAATATAG GACCAAAGAT GCCTCAACTT GGAGCCAGAT TCCTCCTGAA 1080

GACACAGCAT CCACCCGATC TTCATTCACT GTCCAAGACC TTAAACCTTT TACAGAATAT 114 0 

GTGTTTAGGA TTCGCTGTAT GAAGGAAGAT GGTAAGGGAT ACTGGAGTGA CTGGAGTGAA 12 0 0 

GAAGCAAGTG GGATCACCTA TGAAGATAGA CCATCTAAAG CACCAAGTTT CTGGTATAAA 12 60 

ATAGATCCAT CCCATACTCA AGGCTACAGA ACTGTACAAC TCGTGTGGAA GACATTGCCT 1320

CCTTTTGAAG CCAATGGAAA AATCTTGGAT TATGAAGTGA CTCTCACAAG ATGGAAATCA 13 80 

CATTTACAAA ATTACACAGT TAATGCCACA AAACTGACAG TAAATCTCAC AAATGATCGC 144 0 

TATCTAGCAA CCCTAACAGT AAGAAATCTT GTTGGCAAAT CAGATGCAGC TGTTTTAACT 1500 

ATCCCTGCCT GTGACTTTCA AGCTACTCAC CCTGTAATGG ATCTTAAAGC ATTCCCCAAA 156 0 

GATAACATGC TTTGGGTGGA ATGGACTACT CCAAGGGAAT CTGTAAAGAA ATATATACTT 162 0 

GAGTGGTGTG TGTTATCAGA TAAAGCACCC TGTATCACAG ACTGGCAACA AGAAGATGGT 16 80 

ACCGTGCATC GCACCTATTT AAGAGGGAAC TTAGCAGAGA GCAAATGCTA TTTGATAACA 174 0 

GTTACTCCAG TATATGCTGA TGGACCAGGA AGCCCTGAAT CCATAAAGGC ATACCTTAAA 1800

CAAGCTCCAC CTTCCAAAGG ACCTACTGTT CGGACAAAAA AAGTAGGGAA AAACGAAGCT 1860 

GTCTTAGAGT GGGACCAACT TCCTGTTGAT GTTCAGAATG GATTTATCAG AAATTATACT 192 0 

ATATTTTATA GAACCATCAT TGGAAATGAA ACTGCTGTGA ATGTGGATTC TTCCCACACA 198 0 

GAATATACAT TGTCCTCTTT GACTAGTGAC ACATTGTACA TGGTACGAAT GGCAGCATAC 2 04 0 

ACAGATGAAG GTGGGAAGGA TGGTCCAGAA TTCACTTTTA CTACCCCAAA GTTTGCTCAA 2100 

GGAGAAATTG AAGCCATAGT CGTGCCTGTT TGCTTAGCAT TCCTATTGAC AACTCTTCTG 216 0 

GGAGTGCTGT TCTGCTTTAA TAAGCGAGAC CTAATTAAAA AACACATCTG GCCTAATGTT 222 0 

CCAGATCCTT CAAAGAGTCA TATTGCCCAG TGGTCACCTC ACACTCCTCC AAGGCACAAT 22 80 

TTTAATTCAA AAGATCAAAT GTATTCAGAT GGCAATTTCA CTGATGTAAG TGTTGTGGAA 2 34 0 

ATAGAAGCAA ATGACAAAAA GCCTTTTCCA GAAGATCTGA AATCATTGGA CCTGTTCAAA 24 00 

AAGGAAAAAA TTAATACTGA AGGACACAGC AGTGGTATTG GGGGGTCTTC ATGCATGTCA 246 0 

TCTTCTAGGC CAAGCATTTC TAGCAGTGAT GAAAATGAAT CTTCACAAAA CACTTCGAGC 2520

ACTGTCCAGT ATTCTACCGT GGTACACAGT GGCTACAGAC ACCAAGTTCC GTCAGTCCAA 2580

GTCTTCTCAA GATCCGAGTC TACCCAGCCC TTGTTAGATT CAGAGGAGCG GCCAGAAGAT 2640

CTACAATTAG TAGATCATGT AGATGGCGGT GATGGTATTT TGCCCAGGCA ACAGTACTTC 2700 

AAACAGAACT GCAGTCAGCA TGAATCCAGT CCAGATATTT CACATTTTGA AAGGTCAAAG 276 0 

CAAGTTTCAT CAGTCAATGA GGAAGATTTT GTTAGACTTA AACAGCAGAT TTCAGATCAT 2 82 0 

ATTTCACAAT CCTGTGGATC TGGGCAAATG AAAATGTTTC AGGAAGTTTC TGCAGCAGAT 2880

GCTTTTGGTC CAGGTACTGA GGGACAAGTA GAAAGATTTG AAACAGTTGG CATGGAGGCT 2 94 0 

GCGACTGATG AAGGCATGCC TAAAAGTTAC TTACCACAGA CTGTACGGCA AGGCGGCTAC 3000

ATGCCTCAGT GAAGGACTAG TAGTTCCTGC TACAACTTCA GCAGTACCTA TAAAGTAAAG 3060
CTAAAATGAT TTTATCTGTG AATTC 

BCZ6 Protein sequence (SEQ ID NO: 9) 

Gene name: IL-6 receptor beta chain (gp130; oncostatin M receptor); Unigene number:
Hs.82065; Probeset Accession #: M57230 / AA406546; Protein Accession #: NP_002175; Predicted
Signal sequence: 1-22; Predicted TM domains: 625-641; PFAM domains:
fibronectin_type_III_domains: 222-311, 424-509, 519-606; Summary: A type I TM protein; it
homodimerizes or heterodimerizes to make a functional receptor for IL-6, oncostatin-M, IL-11,
LIF, and CNTF.

MLTLQTWVVQ ALFIFLTTES TGELLDPCGY ISPESPVVQL HSNFTAVCVL KEKCMDYFHV 60
NANYIVWKTN HFTIPKEQYT IINRTASSVT FTDIASLNIQ LTCNILTFGQ LEQNVYGITI 120
ISGLPPEKPK NLSCIVNEGK KMRCEWDGGR ETHLETNFTL KSEWATHKFA DCKAKRDTPT 180

SCTVDYSTVY FVNIEVWVEA ENALGKVTSD HINFDPVYKV KPNPPHNLSV INSEELSSIL 240

KLTWTNPSIK SVIILKYNIQ YRTKDASTWS QIPPEDTAST RSSFTVQDLK PFTEYVFRIR 300

CMKEDGKGYW SDWSEEASGI TYEDRPSKAP SFWYKIDPSH TQGYRTVQLV WKTLPPFEAN 360 

GKILDYEVTL TRWKSHLQNY TVNATKLTVN LTNDRYLATL TVRNLVGKSD AAVLTIPACD 420 

FQATHPVMDL KAFPKDNMLW VEWTTPRESV KKYILEWCVL SDKAPCITDW QQEDGTVHRT 480

YLRGNLAESK CYLITVTPVY ADGPGSPESI KAYLKQAPPS KGPTVRTKKV GKNEAVLEWD 540

QLPVDVQNGF IRNYTIFYRT IIGNETAVNV DSSHTEYTLS SLTSDTLYMV RMAAYTDEGG 600

KDGPEFTFTT PKFAQGEIEA IVVPVCLAFL LTTLLGVLFC FNKRDLIKKH IWPNVPDPSK 660

SHIAQWSPHT PPRHNFNSKD QMYSDGNFTD VSVVEIEAND KKPFPEDLKS LDLFKKEKIN 720

TEGHSSGIGG SSCMSSSRPS ISSSDENESS QNTSSTVQYS TVVHSGYRHQ VPSVQVFSRS 780

ESTQPLLDSE ERPEDLQLVD HVDGGDGILP RQQYFKQNCS QHESSPDISH FERSKQVSSV 840

NEEDFVRLKQ QISDHISQSC GSGQMKMFQE VSAADAFGPG TEGQVERFET VGMEAATDEG 900 
MPKSYLPQTV RQGGYMPQ 



BFG4 DNA sequence (SEQ ID NO: 10)

Gene name: KIAA0882 protein; Unigene number: Hs.90419; Probeset Accession #: Z39762;
Nucleic Acid Accession #: AB020689; Coding sequence: 108-2777 (start and stop codons
underlined)



GAACTTATGT AGCCTCATTA TCCCGCTCCG TGAGGTGACA ATTGTGGAAA AGGCAGACAG 60

CTCCAGTGTG CTCCCCAGTC CCTTATCACA TCAGCACCCG AAACAGGATG ACCTTCCTAT 120

TTGCCAACTT GAAAGATAGA GACTTTCTAG TGCAGAGGAT CTCAGATTTC CTGCAACAGA 180 

CTACTTCCAA AATATATTCT GACAAGGAGT TTGCAGGAAG TTACAACAGT TCAGATGATG 24 0 

AGGTGTACTC TCGACCCAGC AGCCTCGTCT CCTCCAGCCC CCAGAGAAGC ACGAGCTCTG 3 00 

ATGCTGATGG AGAGCGCCAG TTTAACCTAA ATGGCAACAG CGTCCCCACA GCCACACAGA 360

CCCTGATGAC CATGTATCGG CGGCGGTCTC CCGAGGAGTT CAACCCGAAA TTGGCCAAAG 420 

AGTTTCTGAA AGAGCAAGCC TGGAAGATTC ACTTTGCTGA GTATGGGCAA GGGATCTGCA 480

TGTACCGCAC AGAGAAAACG CGGGAGCTGG TGTTGAAGGG CATCCCGGAG AGCATGCGTG 54 0 

GGGAGCTCTG GCTGCTGCTG TCAGGTGCCA TCAATGAGAA GGCCACACAT CCTGGGTACT 600 

ATGAAGACCT AGTGGAGAAG TCCATGGGGA AGTATAATCT CGCCACGGAG GAGATTGAGA 66 0 

GGGATTTACA CCGCTCCCTT CCAGAACACC CAGCTTTTCA GAATGAAATG GGCATTGCTG 720

CACTAAGGAG AGTCTTAACA GCTTATGCTT TTCGAAATCC CAACATAGGG TATTGCCAGG 780 

CCATGAATAT TGTCACTTCA GTGCTGCTGC TTTATGCCAA AGAGGAGGAA GCTTTCTGGC 840 

TGCTTGTGGC TTTGTGTGAG CGCATGCTCC CAGATTACTA CAACACCAGA GTTGTGGGTG 900 

CACTGGTGGA CCAAGGTGTC TTTGAGGAGC TAGCACGAGA CTACGTCCCA CAGCTGTACG 960 

ACTGCATGCA AGACCTGGGC GTGATTTCCA CCATCTCCCT GTCTTGGTTC CTCACACTAT 1020 

TTCTCAGTGT GATGCCTTTT GAGAGTGCAG TTGTGGTTGT TGACTGTTTC TTCTATGAAG 108 0 

GAATTAAAGT GATATTCCAG TTGGCCCTAG CTGTGCTGGA TGCAAATGTG GACAAACTGT 1140

TGAACTGCAA GGATGATGGG GAGGCCATGA CCGTTTTGGG AAGGTATTTA GACAGTGTGA 1200 

CCAATAAAGA CAGCACACTG CCTCCCATTC CTCACCTCCA CTCCTTGCTC AGCGATGATG 1260 

TGGAACCTTA CCCTGAGGTA GACATCTTTA GACTCATCAG AACTTCCTAC GAGAAATTCG 132 0 

GAACTATCCG GGCAGATTTG ATTGAACAGA TGAGATTCAA ACAGAGACTG AAAGTGATCC 13 80 

AGACGCTGGA GGATACTACG AAACGCAACG TGGTACGAAC CATTGTGACA GAAACTTCCT 144 0 

TTACCATTGA TGAGCTGGAA GAACTTTATG CTCTTTTCAA GGCAGAACAT CTCACCAGCT 1500 

GCTACTGGGG CGGGAGCAGC AACGCGCTGG ACCGGCATGA CCCCAGCCTG CCCTACCTGG 156 0 

AACAGTATCG CATTGACTTC GAGCAGTTCA AGGGAATGTT TGCTCTTCTC TTTCCTTGGG 1620

CATGTGGAAC TCACTCTGAC GTTCTGGCCT CCCGCTTGTT CCAGTTATTA GATGAAAATG 1680 

GAGACTCTTT GATTAACTTC CGGGAGTTTG TCTCTGGGCT AAGTGCTGCA TGCCATGGGG 174 0 

ACCTCACAGA GAAGCTCAAA CTCCTGTACA AAATGCACGT CTTGCCTGAG CCATCCTCTG 1800 

ATCAAGATGA ACCAGATTCT GCTTTTGAAG CAACTCAGTA CTTCTTTGAA GATATTACCC 1860

CAGAATGTAC ACATGTTGTT GGATTGGATA GCAGAAGCAA ACAGGGTGCA GATGATGGCT 1920 

TTGTTACGGT GAGCCTAAAG CCAGACAAAG GGAAGAGAGC AAATTCCCAA GAAAATCGTA 1980

ATTATTTGAG ACTGTGGACT CCAGAAAATA AATCTAAGTC AAAGAATGCA AAGGATTTAC 2 04 0 

CCAAATTAAA TCAGGGGCAG TTCATTGAAC TGTGTAAGAC AATGTATAAC ATGTTCAGCG 2100

AAGACCCCAA TGAGCAGGAG CTGTACCATG CCACGGCAGC AGTGACCAGC CTCCTGCTGG 2160

AGATTGGGGA GGTCGGCAAG TTGTTCGTGG CCCAGCCTGC AAAGGAGGGC GGGAGCGGAG 2220

GCAGTGGGCC GTCCTGCCAC CAGGGCATCC CAGGCGTGCT CTTCCCCAAG AAAGGGCCAG 2280

GCCAGCCTTA CGTGGTGGAG TCTGTTGAGC CCCTGCCGGC CAGCCTGGCC CCCGACAGCG 2340 

AGGAACACTC CCTTGGAGGA CAAATGGAGG ACATCAAGCT GGAGGACTCC TCGCCCCGGG 2400 

ACAACGGGGC CTGCTCCTCC ATGCTGATCT CTGACGACGA CACCAAGGAC GACAGCTCCA 24 60 

TGTCCTCATA CTCGGTGCTG AGTGCCGGCT CCCACGAGGA GGACAAGCTG CACTGCGAGG 2520

AAATCGGAGA GGACACGGTC CTGGTGCGGA GCGGCCAGGG CACGGCGGCA CTGCCCCGGA 2 580 

GCACCAGCCT GGACCGGGAC TGGGCCATCA CCTTCGAGCA GTTCCTGGCC TCCCTCTTAA 2 64 0 

CTGAGCCTGC CCTGGTCAAG TACTTTGACA AGCCCGTGTG CATGATGGCC AGGATTACCA 2700

GTGCAAAAAA CATCCGGATG ATGGGCAAGC CCCTCACCTC GGCCAGTGAC TATGAAATCT 276 0 

CGGCCATGTC CGGCTGACAC GGGCGCCTTC CCGGGGGAGT GGGAGGAGAG GGAGGGGAGG 2820

GATTTTTTAT GTTCTTCTGT GTTGAGTTTT TTCTTTCTTT CTTTTAAATT AAATATTTAT 2 880 

TAGTACCTGG AATTGAAGCC TAGTGTTTTC ATAATGTAAT TCAATGAAAA CTGTTGGAGA 2 94 0 

AATATTTAAA CACCTCAATG TAGGTACATT ACACTCTTGT TGCGGGGAGG GGATTTACCA 3 000 

GAATACAGTT TATTTCGTGA ATTCTAAAAA ACAAAAAGAT GAATCTGTCA GTGATATGTG 3060 

TGTATTATAA CTTATTAATC TTGCTGTTGA GCTGTATACA TGGTTTAAAA AATAGTACTG 3120 

TTTAATGCTA AGTAAGGCAG CAGTCATTTG TGTATTCAGG CTTTTTAAAT AAAATTAGAG 3180 

CTGTAAGGAA AATGAAAAGC CACAAATGCA AGACTGTTCT TAAATGGAAG GCATAGTCAG 3 24 0 

CGAGGGTAAA TCCTATACCA CTTTAGGAAG TATTAAAAAT ATTTTTAAGA TTTGAAATAT 33 00 

ATTTCATAGA AGTCCTCTAT TCAAAATCAT ATTCCACAGA TGTTCCCCTT CAAAGGGAAA 3360
ACATTTGGGG TTCTAAACAG TTATGAAAGT AAGTGATTTT TACATGATTC CAGAATAACA 3420

CTTGTATTGA CCAATTTAGA CAGATACCAG ACCAATTTTG CATTTAAGAA ATTGTTCTGA 3480

TTATTTACGT CAACTCATTA GAATTCAGTG AAAAGTAACA GTCTTTTGTC ACAGAGAATC 3540 

TGAAAGTAGC AGCAAAGACA GAGGGCTCAT GACAGGTTTT TGCTTTTGCT TTGCTTTTGT 3600 

TTTTGAAAGA GTAAAAGTAC TGATGCTTCT GATACTGGAT GTTTAGCTTC TTACTGCAAA 366 0 

AACATAAGTA AAACAGTCAA CTTTACCATT TCCGTATTCT CCATAGATTG AAGAAATTTA 3 720 

TACCACATAT CGCATATGAC CATCTTTCCA TCAAATCAAT GTAGAGATAA TGTAAACTGA 37 80 

AAAAAAATCT GCAAGATAAT GTAACTGAAT GTTTTAAAAA CAGAACTTGT CACTTTATAT 3 84 0 

AAAAGAATAG TATGCTCTAT TTCCTGAATG GATGTGGAAA TGAAAGCTAG CGCACCTGCA 3 900 

CTTTGAATTC TTGCTTCTTT TTTATTACTG TTATGATTTT GCTTTTTACA GATGTTGGAC 3 960 

GATTTTTTCT TCTGATTGTT GAATTCATAA TCATGGTCTC ATTTCCTTTG CTTCTTTGGA 4 020 

ATATTTCTTT CAACACATTC CTTTATTTTA TTATACATTG TGTCCTTTTT TTAGCTATTG 4 08 0 

CTGCTGTTGT TTTTTATTCT ATTTACAGGA TGATTTTTAA ACTGTCAAAT GAAGTAGTGT 414 0 

TAACCTCAAA TAGGCTAAAT GTGAACAAAT AAAATACAGC AAATACTCAG AAAAAAAAAA 4 200 
AAAAAAAAAA AAAAA 

BFG4 Protein sequence (SEQ ID NO: 11) 

Gene name: KIAA0882 protein; Unigene number: Hs.90419; Probeset Accession #: Z39762;
Protein Accession #: BAA74905; Signal sequence: none; Predicted TM domains: 302-318; PFAM
domains: TBC_domain: 135-347; Summary: a type II membrane protein, likely localized to the
peroxisome.

MTFLFANLKD RDFLVQRISD FLQQTTSKIY SDKEFAGSYN SSDDEVYSRP SSLVSSSPQR 60

STSSDADGER QFNLNGNSVP TATQTLMTMY RRRSPEEFNP KLAKEFLKEQ AWKIHFAEYG 120 

QGICMYRTEK TRELVLKGIP ESMRGELWLL LSGAINEKAT HPGYYEDLVE KSMGKYNLAT 180

EEIERDLHRS LPEHPAFQNE MGIAALRRVL TAYAFRNPNI GYCQAMNIVT SVLLLYAKEE 240

EAFWLLVALC ERMLPDYYNT RVVGALVDQG VFEELARDYV PQLYDCMQDL GVISTISLSW 300

FLTLFLSVMP FESAVVVVDC FFYEGIKVIF QLALAVLDAN VDKLLNCKDD GEAMTVLGRY 360

LDSVTNKDST LPPIPHLHSL LSDDVEPYPE VDIFRLIRTS YEKFGTIRAD LIEQMRFKQR 420

LKVIQTLEDT TKRNVVRTIV TETSFTIDEL EELYALFKAE HLTSCYWGGS SNALDRHDPS 480

LPYLEQYRID FEQFKGMFAL LFPWACGTHS DVLASRLFQL LDENGDSLIN FREFVSGLSA 540

ACHGDLTEKL KLLYKMHVLP EPSSDQDEPD SAFEATQYFF EDITPECTHV VGLDSRSKQG 600 

ADDGFVTVSL KPDKGKRANS QENRNYLRLW TPENKSKSKN AKDLPKLNQG QFIELCKTMY 660 

NMFSEDPNEQ ELYHATAAVT SLLLEIGEVG KLFVAQPAKE GGSGGSGPSC HQGIPGVLFP 720

KKGPGQPYVV ESVEPLPASL APDSEEHSLG GQMEDIKLED SSPRDNGACS SMLISDDDTK 780

DDSSMSSYSV LSAGSHEEDK LHCEEIGEDT VLVRSGQGTA ALPRSTSLDR DWAITFEQFL 840 
ASLLTEPALV KYFDKPVCMM ARITSAKNIR MMGKPLTSAS DYEISAMSG 

BCU7 DNA sequence (SEQ ID NO: 12)

Gene name: EST; Unigene number: Hs.98558; Probeset Accession #: AA428062; Nucleic Acid
Accession #: n/a; Coding sequence: 1-573 (stop codon underlined)

TATTTTATTT TCCAGGCTAA AGCAAATGAA AGTTTGCTGG TATCAACACA GCCTGCCATA 6 0 

TTTTTCACAG CATGCAACAA TGGTGCTAGG ATAGCTATTT CTTACTGTAA TTGCCAGAGG 12 0 

CAGAAATGGT CTGGGTATAA GCTATTTCAT AAAAGCAGCT TTAAATTGTC AGTATTAAGG 180 

TTTTCATGTG GAAAGGTGTC ATTCAAAAAA AAAGTAATTG GCATACATAT TCCACATCAT 24 0 

CGATCCTCTC TGTGGTGTTA ATTTTTTTAT ATGACCAGTA GAAAAATTTT AATATTCTCA 300 

CAATATAGGT TTTGGGGCTT CCATATCATC AAAAGACTGA AAAATTATAA TTTTAGAATT 360 

AAACTGATGG ATTTCATTAT AGAATTATCT GTGAGTTGTG TAGACACAGT CTTAATGTTT 4 20 

CTGGTTATGA CAGATAAGTT TGCTCAAAAA ATGTGGATGA AGCCATTATT GTTATTATTG 4 80 
TTATTGCTTC TGTTCAGTTG TCTAAGTATC ATCCCTTCTG TGGCCCATCA CGCAGCAGAG 54 0 
TTGCCCTACA AATTTCATTT GGCAGCGCCA TAACATTCAT TTAAAAAGTT TATGAAAACA 600
TTCATTTGAA AGTTCCATGC AGCTTTAGCA CAGAGTTGAC CAAACACTGG CGTAAGTTCA 660 
ATTTACACAG AATATTTGAA TTGAAACAAT AGAAATTTTT CTCATAATAT ATACCTATGT 72 0 
GAAACCAACT TATCTGCATA ATTAAATCTA ATACATATTT AAGCCAGTTT AAGTGCTTTG 780 
TGTTGATGCC ATGCTTATCA AATACATGCA CAAGCTAAAC ATAATTTGAA TGGGTCTATG 84 0 
AAGGAAAAAT AATGCTTAGA CTTTGGTGTA GGTTCTTCCT GTGTAGCCAT ATACCCAGGC 900 
TCTGCAGTAT CGAAGGATGC AAATGTTGAC ATAGATGGAA GCTCTTACCT ACCAAAGTGT 96 0 

TTAGGAAGGA TAAAGTTACA TTTGTCTTAA TTTCTAACAT TATCTTTGCT TTTATGTTTC 1020 

ATAAAAATTT GTCATTATTT ATGCTGGTGA AACGTATAAT CACATCCAAT TATTTGAACA 108 0 

CATGCAAAAT AATTTTTTAA ATTATGTTAT TGTTTAAATT TGACTTATGG GAGATCAGTC 114 0 

AAAAACTTAG AAGGTTTAAC ACTTCACTGA TTAATGGTGC TGAAAACACG TTACAATTAC 120 0 

CACATATCCT TGCTATAAGT TTTGAAGTTT CTTAGCAATT AAAGTTTTTT TATTCAGTGT 126 0 

GAACTGTCAG TATCTATTCT GGTGCTAAAT GTATGGTGCT AAATGAATTG TTAGTGTTGA 13 20 

TGGCTTTAGT AATGCTCCTT TTATTCATTG CTAAATTTAG TGTTATCCAT TTGATTCCTG 13 80 

ATTCAGAAAT ATCAATAAAA TCCTATGTTA AATTAATCTT TACCAAAAAC AGGCAAGTTA 14 4 0 

ACTCTGTTGT TTTAATTCAA CAGTCCAACA TTATTTAGGT GTTACAGAGT GTAAATATAT 15 00 

TTCTTTGGGA GTTATTTTCT TTTTAAAATC TTTTTATAGC TTGGCAATGT CCAAAGTCAA 156 0 

ATATCACCTA AACTGGTTAG ATTACTTCTA CAGCTAATAA TATTGCAGGC ACTGGCGCCC 16 20 

TCTGGTGGTT ATGAAGACAA ATTCTTAATG GCTACTTGAC CTACAGCAAA AGCCATTTCT 16 80 

GTACCATAAA AATTTGTTGT GCAATATTAG AATTATCATA TGTTTCCTAC ATCTGACAGC 174 0 

ACCTAAAATG TTTGATAATA TTAACATGTA TCTAAGAGGA AAAAAGAGTT AATATATTCT 1800 

GGCACCCACT TTCCTAGTAA TGTTTTCCAT GATTTTCCAG TTCTGAGGCA CTTATTAAAG 186 0 

TGCTTTTTTT TTTCTGAATT AATTAGGTAT TGGTAAAATA TATTTTTAAA TTTAGTTAGC 192 0 

TTTATAAACA CAATTAGAAT TACAATTAAT TAACAGAGGT ATAATTGTCT CACTTTCAGA 19 80 

AGTGATCATT TATTTTTATT TAGCACAGGT CATAAGAAAA ATATATAGAA AAATAATCAA 2040
TTTCATATAT AAAAGGATTA TTTCTCCACC TTTAATTATT GGCCTATCAT TTGTTAGTGT 2100

TATTTGGTCA TATTATTGAA CTAATGTATT ATTCCATTCA AAGTCTTTCT AGATTTAAAA 216 0 

ATGTATGCAA AAGCTTAGGA TTATATCATG TGTAACTATT ATAGATAACA TCCTAAACCT 2220 

TCAGTTTAGA TATATAATTG ACTGGGTGTA ATCTCTTTTG TAATCTGTTT TGACAGATTT 22 80 

CTTAAATTAT GTTAGCATAA TCAAGGAAGA TTTACCTTGA AGCACTTTCC AAATTGATAC 234 0 

TTTCAAACTT ATTTTAAAGC AGTAGAACCT TTTCTATGAA CTAAATCACA TGCAAAACTC 2400 

CAACCTGTAG TATACATAAA ATGGACTTAC TTATTCCTCT CACCTTCTCC AGTGCCTAGG 24 6 0 

AATATTCTTC TCTGAGCCCT AGGATTGATT CTATCACACA GAGCAACATT AATCTAAATG 2 52 0 

GTTTAGCTCC CTCTTTTTTC TCTAAAAACA ATCAGCTAAT AAAAAAAAAA TTTGAGGGCC 25 80 

TAAATTATTT CAATGGTTGT TTGAAATATT CAGTTCAGTT TGTACCTGTT AGCAGTCTTT 264 0 

CAGTTTGGGG GAGAATTAAA TACTGTGCTA AGCTGGTGCT TGGATACATA TT AC AG CAT C 2 70 0 

TTGTGTTTTA TTTGACAAAC AGAATTTTGG TGCCATAATA TTTTGAGAAT TAGAGAAGAT 276 0 

TGTGATGCAT ATATATAAAC ACTATTTTTA AAAAATATCT AAATATGTCT CACATATTTA 2 82 0 

TATAATCCTC AAATATACTG TACCATTTTA GATATTTTTT AAACAGATTA ATTTGGAGAA 2 8 80 

GTTTTATTCA TTACCTAATT CTGTGGCAAA AATGGTGCCT CTGATGTTGT GATATAGTAT 294 0 

TGTCAGTGTG TACATATATA AAACCTGTGT AAACCTCTGT CCTTATGAAC CATAACAAAT 3000 

GTAGCTTTTT AAAGTCCATT GTATTGTTTT TTCTTTCAAT AAAAGAGTAT AATTAATTGG 3 06 0 
TTGTTTTTGA 



Gene name: [illegible]; Unigene number: [illegible]; Probeset Accession #: AA428062; Protein Accession
#: n/a; Signal sequence: none; Predicted TM domains: 125-141, 154-170; PFAM domains: none;
Summary: A type III membrane protein, highly overexpressed in breast cancer and prostate 
cancer; unknown function. 

YFIFQAKANE SLLVSTQPAI FFTACNNGAR IAISYCNCQR QKWSGYKLFH KSSFKLSVLR 60 

FSCGKVSFKK KVIGIHIPHH RSSLWCXFFY MTSRKILIFS QYRFWGFHII KRLKNYNFRI 120

KLMDFIIELS VSCVDTVLMF LVMTDKFAQK MWMKPLLLLL LLLLFSCLSI IPSVAHHAAE 180 
L P YKFHLAA P 

BFA1 DNA sequence (SEQ ID NO: 14)

Gene name: calsyntenin-2; Unigene number: Hs.7413; Probeset Accession #: R46025; Nucleic
Acid Accession #: NM_022131; Coding sequence: 11-2878 (start and stop codons underlined)

TGCTGCGAGG ATGCTGCCTG GGCGGCTGTG CTGGGTGCCG CTCCTGCTGG CGCTGGGCGT 6 0 

GGGGAGCGGC AGCGGCGGTG GCGGGGACAG CCGGCAGCGC CGCCTCCTCG CGGCTAAAGT 12 0 

CAATAAGCAC AAGCCATGGA TCGAGACTTC ATATCATGGA G T C AT AACTG AGAACAATGA 18 0 

CACAGTCATT TTGGACCCAC CACTGGTAGC C CTGG AT AAA GATGCACCGG TTCCTTTTGC 24 0 

AGGGGAAATC TGTGCGTTCA AGATCCATGG CCAGGAGCTG CCCTTTGAGG CTGTGGTGCT 300 

CAACAAGACA TCAGGAGAGG GCCGGCTCCG TGCCAAGAGC CCCATTGACT GTGAGTTGCA 360 

GAAGGAGTAC ACATTCATCA TCCAGGCCTA TGACTGTGGT GCTGGGCCCC ACGAGACAGC 42 0 

CTGGAAAAAG TCACACAAGG CCGTGGTCCA TATACAGGTG AAGGATGTCA ACGAGTTTGC 480 

TCCCACCTTC AAAGAGCCAG CCTACAAGGC TGTTGTGACG GAGGGCAAGA TCTATGACAG 54 0 

CATTCTGCAG GTGGAGGCCA TTGACGAGGA CTGCTCCCCA CAGTACAGCC AGATCTGCAA 600 

CTATGAAATC GTCACCACAG ATGTGCCTTT TGCCATCGAC AGAAATGGCA ACATCAGGAA 66 0 

CACTGAGAAG CTGAGCTATG ACAAACAACA CCAGTATGAG ATCCTGGTGA CCGCCTACGA 720 

CTGTGGACAG AAGCCCGCTG CTCAGGACAC CCTGGTGCAG GTGGATGTGA AG C C AGTTTG 780 

CAAGCCTGGC TGGCAAGACT GGACCAAGAG GATTGAGTAC CAGCCTGGCT CCGGGAGCAT 84 0 

GCCCCTGTTC CCCAGCATCC ACCTGGAGAC GTGCGATGGA GCCGTGTCTT CCCTCCAGAT 900 

CGTCACAGAG CTGCAGACTA ATTACATTGG GAAGGGTTGT GACCGGGAGA CCTACTCTGA 96 0 

GAAATCCCTT CAGAAGTTAT GTGGAGCCTC CTCTGGCATC ATTGACCTCT TGCCATCCCC 1020 

TAGCGCTGCC ACCAACTGGA CTGCAGGACT GCTGGTGGAC AGCAGTGAGA TGATCTTCAA 1080 

GTTTGACGGC AGG CAGGGTG CCAAAATCCC CGATGGGATT GTGCCCAAGA ACCTGACCGA 114 0 

TCAGTTCACC ATCACCATGT GGATGAAACA CGGCCCCAGC CCTGGTGTGA G AG CCGAGAA 12 00 

GGAAACCATC CTCTGCAACT CAGACAAAAC CGAAATGAAC CGGCATCACT ATGCCCTGTA 1260 

TGTGCACAAC TGCCGCCTCG TCTTTCT CTT GCGGAAGGAC TTCGACCAGG CTGACACCTT 1320 

TCGCCCCGCG GAGTTCCACT GGAAGCTGGA TCAGATTTGT GACAAAGAGT GGCACTACTA 13 80 

TGTCATCAAT GTGGAGTTTC CTGTGGTAAC CTTATACATG GATGGAGCAA CATATGAACC 144 0 

ATACCTGGTG ACCAACGACT GGCCCATTCA TCCATCTCAC AT AG CCATGC AACTCACAGT 1500 

CGGCGCTTGT TGGCAAGGAG GAGAAGTCAC CAAACCACAG TTTGCTCAGT T CTTTCATGG 1560 

AAGCCTGGCC AGTCTCACCA TCCGCCCTGG CAAAATGGAA AGC CAGAAGG TGATCTCCTG 1620 

CCTGCAGGCC TGCAAGGAAG GGCTGGACAT TAATTCCTTG GAAAGCCTTG GCCAAGGAAT 16 80 

AAAGTATCAC TTCAACCCCT CGCAGTCCAT CCTGGTGATG GAAGGTGACG ACATTGGGAA 174 0 

CATTAACCGT GCTCTCCAGA AAGTCTCCTA CATCAACTCC AGGCAGTTCC CAACGGCGGG 1800 

TGTGCGGCGC CTCAAAGTAT CCTCCAAAGT CCAGTGCTTT GGGGAAGACG TATGCATCAG 1860 

TATCCCTGAG GTAGATGCCT ATGTGATGGT CCTCCAGGCC ATCGAGCCCC GGATCACCCT 1920 

C CGGGG C AC A G AC CACTTCT GGAGACCTGC TGCCCAGTTT GAAAGTGCCA GGGGAGTGAC I960 

CCTCTTCCCT GATATCAAGA TTGTGAGCAC CTTCGCCAAA ACCGAAGCCC CCGGGGACGT 204 0 

GAAAACCACA GACCCCAAAT CAGAAGTCTT AGAGGAAATG CTTCATAACT TAGATTTCTG 2100 

TGACATTTTG GTGATCGGAG GGGACTTGGA CCCAAGGCAG GAGTGCTTGG AGCTCAACCA 2160 

CAGTGAGCTC CACCAACGAC ACCTGGATGC CACTAATTCT ACTG C AGGCT ACTCCATCTA 22 20 

CGGTGTGGGC TCCATGAGCC GCTATGAGCA GGTGCTACAT CACATCCGCT ACCGCAACTG 22 80 

GCGTCCGGCT TCCCTTGAGG CCCGGCGTTT CCGGATTAAG TGCTCAGAAC TCAATGGGCG 234 0 

CTACACTAGC AATGAGTTCA ACTTGGAGGT CAG CATC CTT CATGAAGACC AAGTCTCAGA 24 00 

T AAGG AG CAT GTCAATCATC TGATTGTGCA GCCTCCCTTC CTCCAGTCTG TCCATCATCC 2460 

TGAGTCCCGG AGTAGCATCC AGCACAGTTC AGTGGTCCCA AGCATTGCCA CAGTGGTCAT 2 52 0 
CATCATCTCC GTGTGCATGC TTGTGTTTGT CGTGGCCATG GGTGTGTACC GGGTCCGGAT 2580

CGCCCACCAG CACTTCATCC AGGAGACTGA GGCTGCCAAG GAATCTGAGA TGGACTGGGA 264 0 

CGATTCTGCG CTGACTATCA CAGTCAACCC CATGGAGAAA CATGAAGGAC CAGGGCATGG 2700 

GGAAGATGAG ACTGAGGGAG AAGAGGAGGA AGAAGCCGAG GAAGAAATGA GCTCCAGCAG 276 0 

TGGCTCTGAC GACAGCGAAG AGGAGGAGGA GGAGGAAGGG ATGGGCAGAG G CAG ACATGG 2 8 20 

GCAGAATGGA GCCAGGCAAG CCCAGCTGGA GTGGGATGAC TCCACCCTCC CCTACTAGTG 2 880 

CCCAGGGGTC TGCTGCCTGG CCCACATGTC CCTTTTGTAA ACCCTGACCC AGTGTATGCC 2 94 0 

CATGTCTATC ATACCTCACC TCTGATGTCT GTGACATGTC TGGGAAGGCC TTCTCCAGCT 3000 

TCCTGGAGCC CACCCTTTAA GCCTTGGGCA CTCCCTGTGT TTCATCCATG GGGAAGTTCC 3 06 0 

AAGAAGCCCA GCATGGCCAT CAGTGAGGAC TTCAGGGTAG ACTTTGTCCT GTAGCCTCCA 3120 

CTTCTGCCCT AAGTTCCCCA GCATCCTGAC TACCTGTCTG CAGAGTTTGC CTTTGTTTTT 318 0 

TCCTGCAGGG AAGAAGGCCC ACCTTTGTGT CACTCACCTC CCCAGGCTCA GAGTCCCCAA 3 24 0 

GGCCCTGGGG TTCCAACTCA CTGTGCGTCT CCTCCACACA GACCAGTAGG TTCTCCTATG 3 3 00 

CTGACTCCAG GTTGCTTCAT ACAAGGAGGG TGGTTGAACT TCACACACGT AAGGTCTTAG 33 60 

TGCTTAACAG TTTAAAGGAA AGTCCTTGTT GAGGCAGAAC TAAGTTTACA GGGAAAGGTA 34 2 0 

CACACATTCT CTCTCTCTCT CTCTCTCTGT CTATCTAGTT CCCCAGCTTG GAGAGCCTTT 34 80 

CCCCTTGCTT CTTTCTGAGG CCATATAAGC TTATAAGAAA AGTCCCAAAC CAAGAATAGG 3 54 0 

TCCTTGGCCA CAAGCAGGGT CTGATCCCCC ATCAGAGCTA TCTGAGCCTG CCTGTCTGGG 3600 

CACCTGCTGC AACCATGCAG CTACCCTGCC AGGGGCACTC AG CAAACAG A ACCACAGGGC 3660 

CCAGGAGGCA TTCCACACAG GCACTGCCCC AGGACAACAC AACAAGGACA GTCACAACAA 3 720 

GGACAACAAG GACACAACAC AACACACAAC AAGGACAGTC ACAACAAGCC TAG AG CC AG A 3 7 80 

AAGCAGATGG AAATGCTAAT GAGGTCAAAC GTAGGCTTCA TGGTGGGTGG AGTGGGGGTG 3 84 0 

GCTGGGCTCC CCCAGGACAG AGGGGACCCT GAGGTTGGCA AGGCTCTCAC CACTCAGCCT 3 900 

TATGGTCCCT TATCTCCTAT CTTCCCTCTT GAGAAAATAC ACGCTTTCTG CATGTATTAG 3 960 

AAACGCACGA GCTCCACCAA GTCTACAATG AAAGTTTGAA ATTTAACTGC AAGGAATTAG 4 020 

AAGCATATTT GCAATCATTG CAGCTTCTTC TTTCTTCTGC TCATAAAAGG AGGAACACTT 4 080 

TAGATAGAGG GCAAATATAT CTGAAAACCT AATTTCTTTC TTTTTTTGAT AAGGAAATCT 4140

TTTCCATCTC CATCCTAACA TGCACAACCT GTGAAGAGAA TTGTTTCTAT AGTAACTGGT 4 2 00 

CTGTGATCTT TTGTGGCCAA GAGAATAGCA GGCAAGAATT AGGGCCTTGA CAGAATTTCC 4 260 

ACGAAGCTCT GAGAACATGT TTGTTTCGAA TGTCTGATTC CTCTTTGTCA TCAATGTGTA 4320 

TGCTCTGTCC CCATCCTTCA CTCCTCCTCA AGCTCACACC AATTGGTTTG GCACAGGCAC 4 380 
AGAGCTGGTC CCTAGTTAAG TGGCATTTAT GTTAAAAAAA A 
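
The coding-sequence coordinates given in each header are 1-based positions in the printed listing. As a minimal sketch (the helper names `clean_listing` and `codon_at` are hypothetical and do not come from the application), the following Python strips position counters and spacing from listing lines and reads the codon at a stated coordinate. It assumes every letter left after cleaning is a base, so a counter misread as a letter would have to be removed by hand first.

```python
import re


def clean_listing(lines):
    """Join listing lines into one uppercase sequence string.

    Drops position counters, whitespace and any other non-letter
    characters. Assumption: every remaining letter is a base.
    """
    return "".join(re.sub(r"[^A-Za-z]", "", line) for line in lines).upper()


def codon_at(seq, pos):
    """Return the three bases starting at 1-based position `pos`."""
    return seq[pos - 1:pos + 2]


# First printed line of the BFA1 DNA listing (SEQ ID NO: 14).
bfa1_first_line = [
    "TGCTGCGAGG ATGCTGCCTG GGCGGCTGTG CTGGGTGCCG CTCCTGCTGG CGCTGGGCGT 6 0",
]
seq = clean_listing(bfa1_first_line)
print(len(seq), codon_at(seq, 11))
```

Applied to the full listing, the same check at position 2876 should return the stop codon that closes the stated coding range 11-2878.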



BFA1 Protein sequence (SEQ ID NO: 15) 

Gene name: calsyntenin-2; Unigene number: Hs.7413; Probeset Accession #: R46025; Protein
Accession #: NP_071414; Predicted Signal sequence: 1-20; Predicted TM domains: 832-848;
PFAM domains: cadherin_domains: 48-151, 165-254; Summary: A type I membrane protein; a
member of the calsyntenin family; is related to the FAT tumor suppressor; is likely an 
adhesion molecule important in mammalian developmental processes and cell communication. 

MLPGRLCWVP LLLALGVGSG SGGGGDSRQR RLLAAKVNKH KPWIETSYHG VITENNDTVI 60 

LD P PLVALiDK DAP VP FAG E I CAFKIHGQEL PFEAWLNKT SGEGRLRAKS PIDCELQKEY 12 0 

TFIIQAYDCG AGPHETAWKK SHKAWHIQV KDVNEFAPTF KE PAYKAWT EGKIYDSILQ 180 

VEAIDEDCSP QYSQICNYEI VTTDVPFAID RNGNIRNTEK LSYDKQHQYE ILVTAYDCGQ 24 0 

KPAAQDTLVQ VDVKPVCKPG WQDWTKRIEY QPGSGSMPLF PSIHLETCDG AVSSLQIVTE 300 

I/QTNY I G KGC DRETYS EKS L QKLCGASSGI IDLLPSPSAA TNWTAGLLVD SSEMIFKFDG 360 

RQGAKIPDGI VPKNLTDQFT I TMWMKHG PS PGVRAEKETI LCNSDKTEMN RHHYALYVHN 420 

CRLVFLLRKD FDQADTFRPA EFHWKLDQIC DKEWHYYVIN VEFPWTLYM DGATYEPYLV 480 

TNDWPIHPSH IAMQLTVGAC WQGGEVTKPQ FAQFFHGSIA SLTIRPGKME SQKVISCLQA 54 0 

CKEGLDINSL ESLGOGIKYH FNPSQSILVM EGDDIGNINR AJJQKVSYINS RQFPTAGVRR 600 

LKVSSKVQCF GEDVCISIPE VDAYVMVLQA IEPRITLRGT DHFWRPAAQF ESARGVTLFP 660 

DIKIVSTFAK TEAPGDVKTT DPKSEVLEEM LHNLDFCDIL VIGGDLDPRQ ECLELNHSEL 72 0 

HQRHLDATNS TAGYS I YGVG SMSRYEQVLH H I RYRNWRPA SLEARRFRIK CSELNGRYTS 78 0 

NEFNLEVSIL HEDQVSDKEH VNHLIVQPPF LQSVHHPESR SSIQHSSWP SIATWIIIS 84 0 

VCMLVFWAM GVYRVRIAHQ HFIQETEAAK ESEMDWDDSA LTITVNPMEK HEGPGHGEDE 900 
TEGEEEEEAE EEMSSSSGSD DSEEEEEEEG MGRGRHGQNG ARQAQLEWDD STLPY 
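
The protein listing can be cross-checked against the DNA listing by translating from the stated start codon. The sketch below assumes the standard genetic code and a sequence already stripped of spacing and counters; the function name `translate` and its `start` parameter are illustrative choices, not part of the application.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons ordered T, C, A, G at each position.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna, start=1):
    """Translate from 1-based `start` until a stop codon or the end.

    Codons containing ambiguity codes (for example Y, R, K or N)
    are reported as 'X'.
    """
    protein = []
    for i in range(start - 1, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[i:i + 3].upper(), "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 40 bases of the BFA1 DNA listing; the coding sequence starts at 11.
bfa1_start = "TGCTGCGAGGATGCTGCCTGGGCGGCTGTGCTGGGTGCCG"
print(translate(bfa1_start, start=11))
```

The ten residues produced here can be compared with the first block of the BFA1 protein listing (MLPGRLCWVP); running the full cleaned DNA through the same function is one way to locate OCR damage in the protein text.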



BFG7 DNA sequence (SEQ ID NO: 16)

Gene name: EST; Unigene number: Hs.91668; Probeset Accession #: Z40805; Nucleic Acid
Accession #: n/a; Coding sequence: <1-906 (stop codon underlined)

CGGGTCGACC CACGCGTCCG GGGAGAAAGG ATGGCCGGCC TGGCGGCGCG GTTGGTCCTG 6 0 

CTAGCTGGGG CAGCGGCGCT GGCGAGCGGC TCCCAGGGCG ACCGTGAGCC GGTGTACCGC 12 0 

GACTGCGTAC TGCAGTGCGA AGAGCAGAAC TGCTCTGGGG GCGCTCTGAA TCACTTCCGC 18 0 

TCCCGCCAGC CAATCTACAT GAGTCTAGCA GGCTGGACCT GTCGGGACGA CTGTAAGTAT 240 

GAGTGTATGT GGGTCACCGT TGGGCTCTAC CTCCAGGAAG GTCACAAAGT GCCTCAGTTC 3 00 

CATGGCAAGT GGCCCTTCTC CCGGTTCCTG TTCTTTCAAG AGCCGGCATC GGCCGTGGCC 36 0 

TCGTTTCTCA ATGGCCTGGC CAGCCTGGTG ATGCTCTGCC GCTACCGCAC CTTCGTGCCA 4 20 

GCCTCCTCCC CCATGTACCA CACCTGTGTG GCCTTCGCCT GGGTGTCCCT CAATGCATGG 4 80 

TTCTGGTCCA CAGTYTTCCA CACCAGGGAC ACTGACCTCA CAGAGAAAAT GG ACT ACTT C 54 0 

TGTGCCTCCA CTGTCATCCT ACACTCAATC TACCTGTGCT GCGTCAGCCT CATCCGCTTC 600 

GACTATGGCT ACAACCTGGT GGCCAACGTG GCTATTGGCC TGGTCAACGT GGTGTGGTGG 660 

CTGGCCTGGT GCCTGTGGAA CCAGCGGCGG CTGCCTCACG TGCGCAAGTG CGTGGTGGTG 72 0 

GTCTTGCTGC TGCAGGGGCT GTCCCTGCTC GAGCTGCTTG ACTTCCCACC GCTCTTCTGG 780 

GTCCTGGATG CCCATGCCAT CTGGCACATC AGCACCATCC CTGTCCACGT CCTCTTTTTC 84 0 

AGCTTTCTGG AAGATGACAG CCTGTACCTG CTGAAGGAAT CAGAGGACAA GTTCAAGCTG 900 
GACTGAAGAC CTTGGAGCGA GTCTGCCCCA GTGGGGATCC TGCCCCCGCC CTGCTGGCCT 960

CCCTTCTCCC CTCAACCCTT GAGATGATTT TCTCTTTTCA ACTTCTTGAA CTTGGACATG 102 0 

AAGGATGTGG GCCCAGAATC ATGTGGCCAG CCCACCCCCT GTTGGCCCTC ACCAGCCTTG 10 80 

GAGTCTGTTC TAGGGAAGGC CTCCCAGCAT CTGGGACTCG AGAGTGGGCA GCCCCTCTAC 114 0 

CTCCTGGAGC TGAACTGGGG TGGAACTGAG TGTGCTCTTA GCTCTACCGG GAGGACAGCT 1200 

GCCTGTTTCC TCCCCATCAG CCTCCTCCCC ACATCCCCAG CTGCCTGGCT GGGTCCTGAA 1260 

GCCCTCTGTC TACCTGGGAG ACCAGGGACC ACAGGCCTTA GGGATACAGG GGGTCCCCTT 13 2 0 

CTGTTACCAC CCCCCACCCT CCTCCAGGAC ACCACTAGGT GGTGCTGGAT GCTTGTTCTT 13 80 

TGGCCAGCCA AGGTTCACGG CGATTCTCCC CATGGGATCT TGAGGGACCA AGCTGCTGGG 144 0 

ATTGGGAAGG AGTTTCACCC TGACCRTTGC CCTAGCCAGG TTCCCAGGAG GCCTCACCAT 1500 

ACTCCCTTTC AGGGCCAGGG CTCCAGCAAG CCCAGGGCAA GGATCCTGTG CTGCTGTCTG 1560 

GTTGAGAGCC TGCCACCGTG TGTCGGGAGT GTGGGCCAGG CTGAGTGCAT AGGTGACAGG 162 0 

GCCGTGAGCA TGGGCCTGGG TGTGTGTGAG CTCAGGCACT AGGTGCGCAG TGTGGAGACG 16 80 

GGTGTTGTCG GGGAAGAGGT GTGGCTTCAA AGTGTTGTGT GTGCAGGGGG TKGGTGTGTT 174 0 

AAGCGTGGGT TAGGGGAACG TGTGTGCGCG TGCTGGTGGG CATGTGAGAT GAGTGACTGC 18 00 

CGGTGAATGT GTCCACAGTT GAGAGGTTGG AGCAGGATGA GGGAATCCTG TCACCATCAA 186 0 

TAATCACTTG TGGAGCGCCA CTTGGCCCAA GACGCCACCT GGGCGGACAG CAGGAGCTCT 1920 

CCATGGCCAG GCTGCCTGTG TGCATGTTCC CTGTCTGGTG CCCCTTTGCC CGCCTCCTGC 19 8 0 

AAACCTCACA GGGTCCCCAC ACAACAGTGC CCTCCAGAAG CAGCCCCTCG GAGGCAGAGG 204 0 

AAGGAAAATG GGGATGGCTG GGGCTCTCTC CATCCTCCTT TTCTCCTTGC CTTCGCATGG 2100 

CTGGCCTTCC CCTCCAAAAC CTCCATTCCC CTGCTGCCAG CCCCTTTGCC ATAGCCTGAT 216 0 

TTTGGGGAGG AGGAAGGGGC GATTTGAGGG AGAAGGGGAG AAAGCTTATG GCTGGGTCTG 222 0 

GTTTCTTCCC TTCCCAGAGG GTCTTACTGT TCCAGGGTGG CCCCAGGGCA GGCAGGGGCC 22 80 

AC ACT ATG CC TGCGCCCTGG TAAAGGTGAC CCCTGCCATT TACCAGCAGC CCTGGCATGT 234 0 

TCCTGCCCCA CAGGAATAGA ATGGAGGGAG CTCCAGAAAC TTTCCATCCC AAAGGCAGTC 24 00 

TCCGTGGTTG AAGCAGACTG G ATTTTTG CT CTGCCCCTGA CCCCTTGTCC CTCTTTGAGG 24 6 0 

GAGGGGAGCT ATGCTAGGAC TCCAACCTCA GGGACTCGGG TGGCCTGCGC TAGCTTCTTT 2520

TGATACTGAA AACTTTTAAG GTGGGAGGGT GGCAAGGGAT GTGCTTAATA AATCAATTCC 25 80 

AAGCCTCAAA AAAAAAAAAA AAAAAAAAAA AAAAAA 

BFG7 Protein sequence (SEQ ID NO: 17)

Gene name: EST; Unigene number: Hs.91668; Probeset Accession #: Z40805; Protein Accession
#: n/a; Signal sequence: none; Predicted TM domains: 117-133, 179-195, 211-227, 235-251, 
266-282, 296-312; PFAM domains: none; Summary: A type III membrane protein of unknown 
function; is adjacent to HER2 on the genome, and its overexpression in breast cancer is highly 
correlated with HER2 expression; may be used to predict HER2 overexpression and amplification. 

RVDPRVRGER MAGLAARLVL LAGAAALASG SQGDREPVYR DCVLQCEEQN CSGGALNHFR 6 0 

SRQPIYMS1A GWTCRDDCKY ECMWVTVGLY LQEGHKVPQF HGKWPFSRFL FFQEPASAVA 120 

SFLNGLASLV MLCRYRTFVP ASSPMYHTCV AFAWVSLNAW FVJSTVFHTRD TDLTEKMDYF 180 

CASTVILHSI YLCCVRTVGL QHPAWSAFR ALLLLMLTVH VSYLSLIRFD YGYNLVANVA 24 0 

IGLVNWWWL AWCLWNQRRL PHVRKCWW LLLQGLSL.LE LLDFPPLFWV LDAHAIWHIS 3 00 
TIPVHVLFFS FLEDDSLYLL KESEDKFKLD 

BCN4 DNA sequence (SEQ ID NO: 18)

Gene name: ESTs; Unigene number: Hs.283713; Probeset Accession #: F13673; Nucleic Acid
Accession #: n/a; Coding sequence: 143-874 (start and stop codons underlined)

GGGAGGGAGA GAGGCGCGCG GGTGAAAGGC GCATTGATGC AGCCTGCGGC GGCCTCGGAG 60 

CGCGGCGGAG CCAGACG CTG ACCACGTTCC TCTCCTCGGT CTCCTCCGCC TCCAGCTCCG 120 

CGCTGCCCGG CAGCCGGGAG CCATGCGACC CCAGGGCCCC GCCGCCTCCC CGCAGCGGCT 180 

CCGCGGCCTC CTGCTGCTCC TGCTGCTGCA GCTGCCCGCG CCGTCGAGCG CCTCTGAGAT 24 0 

CCCCAAGGGG AAGCAAAAGG CGCAGCTCCG GCAGAGGGAG GTGGTGGACC TGTATAATGG 3 00 

AATGTGCTTA CAAGGGCCAG CAGGAGTGCC TGGTCGAGAC GGGAGCCCTG GGGCCAATGG 360 

CATTCCGGGT ACACCTGGGA TCCCAGGTCG GGATGGATTC AAAGGAGAAA AGGGGGAATG 420 

TCTGAGGGAA AGCTTTGAGG AGTCCTGGAC ACCCAACTAC AAGCAGTGTT CATGGAGTTC 4 80 

ATTGAATTAT GGCATAGATC TTGGGAAAAT TGCGGAGTGT ACATTTACAA AGATGCGTTC 54 0 

AAATAGTGCT CTAAGAGTTT TGTTCAGTGG CTCACTTCGG CT AAAATG C A GAAATGCATG 600 

CTGTCAGCGT TGGTATTTCA CATTCAATGG AGCTGAATGT TCAGGACCTC TTCCCATTGA 660 

AGCTATAATT TATTTGGACC AAGGAAGCCC TGAAATGAAT TCAACAATTA ATATTCATCG 72 0 

CACTTCTTCT GTGGAAGGAC TTTGTGAAGG AATTGGTGCT GGATTAGTGG ATGTTGCTAT 78 0 

CTGGGTTGGC ACTTGTTCAG ATTACCCAAA AGGAGATGCT TCTACTGGAT GGAATTCAGT 84 0 

TTCTCGCATC ATTATTGAAG AACTACCAAA ATAAATGCTT TAATTTTCAT TTGCT AC CTC 900 

TTTTTTTATT ATGCCTTGGA ATGGTTCACT TAAATGACAT TTTAAATAAG TTTATGTATA 96 0 

CATCTGAATG AAAAGCAAAG CTAAATATGT TTACAGACCA AAGTGTGATT TCACACTGTT 102 0 

TTTAAATCTA GCATTATTCA TTTTGCTTCA ATCAAAAGTG GTTTCAATAT TTTTTTTAGT 108 0 

TGGTTAGAAT ACTTTCTTCA TAGTCACATT CTCTCAACCT ATAATTTGGA ATATTGTTGT 114 0 

GGTCTTTTGT TTTTTCT CTT AGTATAGCAT TTTT AAAAAA AT ATAAAAG C TACCAATCTT 1200 

TGTACAATTT GTAAATGTTA AGAATTTTTT TTATATCTGT TAAATAAAAA TTATTTCCAA 12 60 
CAACCTTAAA AAAAAAAAAA AAAA 

BCN4 Protein sequence (SEQ ID NO: 19) 

Gene name: ESTs; Unigene number: Hs.283713; Probeset Accession #: F13673; Protein Accession
#: n/a; Predicted Signal sequence: 1-30; TM domains: none; PFAM domains: none; Summary: a
secreted protein; has a mouse orthologue (see sequence below).

MRPQGPAASP QRLRGLLLLL LLQLPAPSSA SEIPKGKQKA QLRQREWDL YNGMCLQGPA 60

GVPGRDGSPG ANGIPGTPGI PGRDGFKGEK GECLRESFEE SWTPNYKQCS WSSLNYGIDL 120 

GKIAECTFTK MRSNSALRVL FSGSLRLKCR NACCQRWYFT FNGAECSGPL PIEAIIYLDQ 18 0 

GSPEMNSTIN IHRTSSVEGL C EG I GAG LVD VAIWVGTCSD YPKGDASTGW NSVSRIIIEE 24 0 
LPK 

Mouse BCN4 Protein sequence (SEQ ID NO: 20) 
Gene name: ESTs; Unigene number: Mm. 41556 

XXXXAAPPQL LLGLFLVLLL LLQLSAPSSA SENPKVKQKA LIRQREWDL YNGMCLQGPA 60 

GVPGRDGSPG ANGIPGTPGI PCQDGFKGEK GECLRESFEE SWTPNYKQCS WSSLNYGIDL 120 

GKIAECTFTK MRSNSALRVL FSGSLRLKCR NACCQRWYFT FNGAECSGPP PIEAIXXXXX 180 

XXXXXXXXXX XXXXXXXXXX XXXXXXXXXX XXXXXXXXSD YPKGDAYTGW DSVSRIIIEE 24 0 
LPK 
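
The human and mouse BCN4 listings can be compared position by position once both are aligned to the same length. The following is a rough sketch only: `percent_identity` is a hypothetical helper, it performs no alignment of its own, and it skips positions marked X (unknown) in either sequence. Blocks that lost residues in OCR would need to be repaired before the comparison is meaningful.

```python
def percent_identity(a, b, unknown="X"):
    """Percent identity over positions where neither residue is `unknown`.

    Assumes `a` and `b` are already aligned and of equal length.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    pairs = [(x, y) for x, y in zip(a, b) if unknown not in (x, y)]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


# Residues 61-90 as printed in the human and mouse BCN4 listings.
human = "GVPGRDGSPG" "ANGIPGTPGI" "PGRDGFKGEK"
mouse = "GVPGRDGSPG" "ANGIPGTPGI" "PCQDGFKGEK"
print(percent_identity(human, mouse))
```

Skipping X positions keeps the unsequenced stretches of the mouse entry from being counted as mismatches.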
MACK and GISH - PATENT

Application No.: 09/829,472 
Page 4 

VERSION WITH MARKINGS TO SHOW CHANGES MADE 
In the Specification: 

Paragraph (Table 1) beginning at line 1 of page 94 has been amended as follows (see 
attached pages 94-103): 






Table 1 

BCA4 DNA sequence (SEQ ID NO:1)

Gene name: osteoblast specific factor 2 (periostin); Unigene number: Hs.136348; Probeset
Accession #: D13666; Nucleic Acid Accession #: NM_006475; Coding sequence: 12-2522 (start 
and stop codons underlined) 

AGAGACTCAA GATGATTCCC TTTTTACCCA TGTTTTCTCT ACTATTGCTG CTTATTGTTA 6 0 

ACC CT AT AAA CGCCAACAAT CATTATGACA AGATCTTGGC TCATAGTCGT ATCAGGGGTC 120 
GGGACCAAGG CCCAAATGTC TGTGC CCTTC AACAGATTTT GGGCACCAAA AAGAAATACT 180 
TCAGCACTTG TAAGAACTGG TATAAAAAGT CCATCTGTGG ACAGAAAACG ACTGTTTTAT 24 0 
ATGAATGTTG CCCTGGTTAT ATGAGAATGG AAGGAATGAA AGGCTGCCCA GCAGTTTTGC 300 
CCATTGACCA TGTTTATGGC ACTCTGGGCA TCGTGGGAGC CACCACAACG CAGCGCTATT 36 0 
CTGACGCCTC AAAACTGAGG GAGGAGATCG AGGGAAAGGG ATCCTTCACT TACTTTGCAC 4 20 
CGAGTAATGA GGCTTGGGAC AACTTGGATT CTGATATCCG TAGAGGTTTG GAGAGCAACG 4 80 
TGAATGTTGA ATTACTGAAT GCTTTACATA GTCACATGAT TAATAAGAGA ATGTTGACCA 54 0 
AGGACTTAAA AAATGGCATG ATTATTCCTT CAATGTATAA CAATTTGGGG CTTTTCATTA 600 
AC C ATTATCC TAATGGGGTT GTCACTGTTA ATTGTGCTCG AATCATCCAT GGGAAC CAG A 66 0 
TTGCAACAAA TGGTGTTGTC CATGTCATTG ACCGTGTGCT TACACAAATT GGTACCTCAA 72 0 
TTCAAGACTT CATTGAAGCA GAAGATGACC TTTCATCTTT TAGAGCAGCT GCCATCACAT 78 0 
CGGACATATT GGAGGCCCTT GGAAGAGACG GTCACTTCAC ACTCTTTGCT CCCACCAATG 84 0 
AGGCTTTTGA GAAACTTCCA CGAGGTGTCC TAGAAAGGTT CATGGGAGAC AAAGTGGCTT 900 
CCGAAGCTCT TATGAAGTAC CACATCTTAA ATACTCTCCA GTGTTCTGAG TCTATTATGG 96 0 
GAGGAGCAGT CTTTGAGACG CTGGAAGGAA ATACAATTGA GATAGGATGT GACGGTGACA 1020 
GTATAACAGT AAATGGAATC AAAATGGTGA ACAAAAAGGA TATTGTGACA AATAATGGTG 1080 
TGATCCATTT GATTGATCAG GTCCTAATTC CTGATTCTGC CAAACAAGTT ATTGAGCTGG 114 0 
CTGGAAAACA GCAAACCACC TTCACGGATC TTGTGGCCCA ATTAGGCTTG GCATCTGCTC 1200
TGAGGCCAGA TGGAGAATAC ACTTTGCTGG CACCTGTGAA TAATGCATTT TCTGATGATA 1260 
CTCTCAGCAT GGTTCAGCGC CTCCTTAAAT TAATTCTGCA GAATCACATA TTGAAAGTAA 132 0 
AAGTTGGCCT TAATGAGCTT TACAACGGGC AAATACTGGA AACCATCGGA GGCAAACAGC 13 80 
TCAGAGTCTT CGTATATCGT ACAGCTGTCT GCATTGAAAA TTCATGCATG GAGAAAGGGA 144 0 
GTAAGCAAGG GAGAAACGGT GCGATTCACA TATTCCGCGA GATCATCAAG C C AGCAG AG A 1500 
AATCCCTCCA TGAAAAGTTA AAACAAGATA AGCGCTTTAG CACCTTCCTC AGCCTACTTG 156 0 
AAGCTGCAGA CTTGAAAGAG CTCCTGACAC AACCTGGAGA CTGGACATTA TTTGTGCCAA 16 2 0 
CCAATGATGC TTTTAAGGGA ATGACTAGTG AAGAAAAAGA AATTCTGATA CGGGACAAAA 1680 
ATGCTCTTCA AAACATCATT CTTTATCACC TGACACCAGG AGTTTT CATT GGAAAAGGAT 174 0 
TTGAACCTGG TGTTACTAAC ATTTTAAAGA CCACACAAGG AAGCAAAATC TTTCTGAAAG 1800 
AAGTAAATGA TACACTTCTG GTGAATGAAT TGAAATCAAA AGAATCTGAC ATCATGACAA 186 0 
CAAATGGTGT AATTCATGTT GTAGATAAAC TCCTCTATCC AGCAGACACA CCTGTTGGAA 192 0 
ATGATCAACT GCTGGAAATA CTTAATAAAT TAAT CAAAT A CATCCAAATT AAGTTTGTTC 1980 
GTGGTAGCAC CTTCAAAGAA ATCCCCGTGA CTGTCTATAC AACTAAAATT ATAAC CAAAG 2 04 0 
TTGTGGAACC AAAAATTAAA GTGATTGAAG GCAGTCTTCA GCCTATTATC AAAACTGAAG 2100 
GACCCACACT AACAAAAGTC AAAATTGAAG GTGAACCTGA ATTCAGACTG ATTAAAGAAG 2160 
GTGAAACAAT AACTGAAGTG ATCCATGGAG AGCCAATTAT TAAAAAATAC ACCAAAATCA 2220 
TTGATGGAGT GCCTGTGGAA ATAACTGAAA AAGAGACACG AGAAGAACGA AT CATT ACAG 22 80 
GTCCTGAAAT AAAATACACT AGGATTTCTA CTGGAGGTGG AGAAACAGAA GAAACTCTGA 234 0 
AGAAATTGTT ACAAGAAGAG GTCACCAAGG TCACCAAATT CATTGAAGGT GGTGATGGTC 24 00 
ATTTATTTGA AGATGAAGAA ATTAAAAGAC TGCTTCAGGG AGACACACCC GTGAGGAAGT 24 60 
TGCAAGCCAA CAAAAAAGTT CAAGGTTCTA GAAGACGATT AAGGGAAGGT CGTTCTCAGT 2520 
GAAAATCCAA AAACCAGAAA AAAATGTTTA TACAACCCTA AGTCAATAAC CTGACCTTAG 258 0 
AAAATTGTGA GAGCCAAGTT GACTTCAGGA ACTGAAACAT CAGCACAAAG AAGCAATCAT 264 0 
CAAATAATTC TGAACACAAA TTTAATATTT TTTTTTCTGA ATGAGAAACA TGAGGGAAAT 2 7 00 
TGTGGAGTTA GCCTCCTGTG GTAAAGGAAT TGAAGAAAAT ATAACAC CTT ACACCCTTTT 2760 
TCATCTTGAC ATTAAAAGTT CTGGCTAACT TTGGAATCCA TTAGAGAAAA ATCCTTGTCA 2 820 
CCAGATTCAT TACAATTCAA ATCGAAGAGT TGTGAACTGT TATC CCATTG AAAAG AC CGA 2 880 
GCCTTGTATG TATGTTATGG ATACATAAAA TGCACGCAAG CCATTATCTC TCCATGGGAA 2940 
GCTAAGTTAT AAAAATAGGT G CTTGGTGT A CAAAACTTTT TATATCAAAA GGCTTTGCAC 3 000 
ATTTCTATAT GAGTGGGTTT ACTGGTAAAT TATGTTATTT TTTACAACTA ATTTTGTACT 3 060 
CTCAGAATGT TTGTCATATG CTTCTTGCAA TGCATATTTT TTAATCTCAA ACGTTTCAAT 3120 
AAAACCATTT TTCAGATATA AAGAGAATTA CTTCAAATTG AGTAATTCAG AAAAACTCAA 3180 
GATTTAAGTT AAAAAGTGGT TTGGACTTGG GAA 

BCA4 Protein sequence (SEQ ID NO: 2)

Gene name: osteoblast specific factor 2 (periostin); Unigene number: Hs.136348; Probeset
Accession #: D13666; Protein Accession #: NP_006466; Predicted Signal sequence: 1-21; TM
domains: none; PFAM domains: fasciclin_domains: 94-232, 234-367, 496-630; Summary: a
secreted protein involved in adhesion and osteoblast development; may participate in 
preferential metastasis of breast cancer to bone. 

MIPFLPMFSL LLLLIVNPIN ANNHYDKILA HSRIRGRDQG PNVCALQQIL GTKKKYFSTC 60 
KNWYKKSICG QKTTVLYECC PGYMRMEGMK GCPAVLPIDH VYGT LG I VGA TTTQRYSDAS 12 0 
KLREEIEGKG SFTYFAPSNE AWDNLDSDIR RGLESNVNVE LLNALHSHMI NKRMLTKDLK 180 
NGMIIPSMYN NLGLFINHYP NGWTVNCAR IIHGNQIATN GWHVTDRVL TQIGTSIQDF 240 
IEAEDDLSSF RAAAITSDIL EALGRDGHFT LFAPTNEAFE KLPRGVLERF MGDKVASEAL 300
MKYHILNTLQ CSESIMGGAV FETLEGNT I E IGCDGDSITV NGIKMVNKKD I VTNNG V I HL 360 
IDQVLIPDSA KQVIELAGKQ QTTFTDLVAQ LGLASALiRPD GEYTLLAPVN NAFSDDTLSM 420 
VQRLLKLILQ NHILKVKVGL NELYNGQILE TIGGKQLRVF VYRTAVCIEN SCMEKGSKQG 480

RNGAIHIFRE IIKPAEKSLH EKLKQDKRFS TFLSLLEAAD LKELLTQPGD WTLFVPTNDA 54 0 

FKGMTSEEKE ILIRDKNALQ NIILYHLTPG VFIGKGFEPG VTNILKTTQG SKIFLKEVND 600 

TLLVNELKSK ESDIMTTNGV I HWDKLLY P ADT PVGNDQL LEILNKLIKY IQIKFVRGST 660 

FKEIPVTVYT TKIITKWEP KIKVIEGSLQ PIIKTEGPTI* TKVKIEGEPE FRLIKEGETI 720 

TEVIHGEPII KKYTKIIDGV PVEITEKETR EERIITGPEI KYTRISTGGG ETEETLKKLL 780 
QEEVTKVTKF IEGGDGHLFE DEEIKRLLQG DTPVRKLQAN KKVQGS RRRL REGRSQ 

BCA7 DNA sequence (SEQ ID NO: 3) 

Gene name: 5T4 oncofetal trophoblast glycoprotein; Unigene number: Hs.82128; Probeset
Accession #: Z29083; Nucleic Acid Accession #: NM_006670; Coding sequence: 85-1347 (start
and stop codons underlined)

CCGGCTCGCG CCCTCCGGGC CCAGCCTCCC GAGCCTTCGG AGCGGGCGCC GTCCCAGCCC 6 0 

AGCTCCGGGG AAACGCGAGC CGCGATGCCT GGGGGGTGCT CCCGGGGCCC CGCCGCCGGG 12 0 

GACGGGCGTC TGCGGCTGGC GCGACTAGCG CTGGTACTCC TGGGCTGGGT CTCCTCGTCT 180 

TCTCCCACCT CCTCGGCATC CTCCTTCTCC TCCTCGGCGC CGTTCCTGGC TTCCGCCGTG 24 0 

TCCGCCCAGC CCCCGCTGCC GGACCAGTGC CCCGCGCTGT GCGAGTGCTC CGAGGCAGCG 3 00 

CGCACAGTCA AGTGCGTTAA CCGCAATCTG ACCGAGGTGC CCACGGACCT GCCCGCCTAC 360 

GTGCGCAACC TCTTCCTTAC CGGCAACCAG CTGGCCGTGC TCCCTGCCGG CGCCTTCGCC 420 

CGCCGGCCGC CGCTGGCGGA GCTGGCCGCG CTCAACCTCA GCGGCAGCCG CCTGGACGAG 4 80 

GTGCGCGCGG GCGCCTTCGA GCATCTGCCC AGCCTGCGCC AGCTCGACCT CAGCCACAAC 54 0 

CCACTGGCCG ACCTCAGTCC CTTCGCTTTC TCGGGCAGCA ATGCCAGCGT CTCGGCCCCC 600 

AGTCCCCTTG TGGAACTGAT CCTGAACCAC ATCGTGCCCC CTGAAGATGA GCGGCAGAAC 6 60 

CGGAGCTTCG AGGGCATGGT GGTGGCGGCC CTGCTGGCGG GCCGTGCACT GCAGGGGCTC 7 20 

CGC CGCTTGG AGCTGGCCAG CAACCACTTC CTTTACCTGC CGCGGGATGT GCTGGCCCAA 7 80 

CTGCCCAGCC TCAGGCACCT GGACTTAAGT AATAATTCGC TGGTGAGCCT GACCTACGTG 84 0 

TCCTTCCGCA ACCTGACACA TCTAGAAAGC CTCCACCTGG AGGACAATGC CCTCAAGGTC 900 

CTTCACAATG GCACCCTGGC TGAGTTGCAA GGTCTACCCC ACATTAGGGT TTTCCTGGAC 960 

AACAATCCCT GGGTCTGCGA CTGCCACATG GCAGACATGG TG AC CTGGCT CAAGGAAACA 102 0 

GAGGTAGTGC AGGGCAAAGA CCGGCTCACC TGTGCATATC CGGAAAAAAT GAGGAATCGG 1080 

GTCCTCTTGG AACTCAACAG TGCTGACCTG GACTGTGACC CGATTCTTCC CCCATCCCTG 114 0 

CAAACCTCTT ATGTCTTCCT GGGTATTGTT TTAGCCCTGA TAGGCG CTAT TTTCCTCCTG 1200 

GTTTTGTATT TGAACCGCAA GGGGATAAAA AAGTGGATGC ATAACATCAG AGATGCCTGC 1260 

AGGGATCACA TGGAAGGGTA TCATTACAGA TATGAAATCA ATGCGGACCC CAGATTAACA 13 2 0 

AACCTCAGTT CTAACTCGGA TGTCTGAGAA ATATTAGAGG ACAGACCAAG GACAACTCTG 1380 

CATGAGATGT AGACTTAAGC TTTATCCCTA CTAGGCTTGC TCCACTTTCA TCCTCCACTA 144 0 

TAGATACAAC GGACTTTGAC TAAAAGCAGT GAAGGGGATT TGCTTCCTTG TTATGTAAAG 1500 

TTTCTCGGTG TGTTCTGTTA ATGTAAGACG ATGAACAGTT GTGTATAGTG TTTTACCCTC 156 0 

TTCTTTTTCT TGGAACTCCT CAACACGTAT GGAGGGATTT TTCAGGTTTC AGCATGAACA 16 20 

TGGGCTT CTT GCTGTCTGTC TCTCTCTCAG TACAGTTCAA GGTGTAGCAA GTGTACCCAC 16 8 0 

ACAGATAGCA TTCAACAAAA GCTGCCTCAA CTTTTTCGAG AAAAATACTT TATTCATAAA 1740 

TATCAGTTTT ATTCTCATGT ACCTAAGTTG TGGAGAAAAT AATTGCATCC TATAAACTGC 18 00 

CTGCAGACGT TAG CAGGCTC TTCAAAATAA CTCCATGGTG CACAGGAGCA CCTGCATCCA 186 0 

AGAGCATGCT TACATTTTAC TGTTCTGCAT ATTACAAAAA ATAACTTGCA ACTTCATAAC 1920 

TTCTTTGACA AAGTAAATTA CTTTTTTGAT TGCAGTTTAT ATGAAAATGT ACTGATTTTT 198 0 

TTTTAATAAA CTGCATCGAG ATCCAACCGA CTGAATTGTT AAAAAAAAAA AAAAATAAAG 2 04 0 
ATTCTTAAAA GAA 


BCA7 Protein sequence (SEQ ID NO:4)
Gene name: 5T4 oncofetal trophoblast glycoprotein; Unigene number: Hs.82128; Probeset
Accession #: Z29083; Protein Accession #: NP_006661; Predicted Signal sequence: [range illegible];
Predicted TM domains: 357-373; PFAM domains: leucine_rich_repeats: 61-90, 119-142, 143-166,
235-258, 259-282, 294-345;
Summary: a type Ia TM protein of unknown function, detected in multiple cancers, with highest
expression in breast cancer. 

MPGGCSRGPA AGDGRLRLAR LALVLLGWVS SSSPTSSASS FSSSAPFLAS AVSAQPPLPD 6 0 

QCPALCECSE AARTVKCVNR NLTEVPTDLP AYVRNLFLTG NQLAVLPAGA FARRPPLAEL 120 

AALNLSGSRL DEVRAGAFEH LPSLRQLDLS HNPLADLSPF AFSGSNASVS APSPLVELIL 180 

NHIVPPEDER QNRSFEGMW AALLAGRALQ GLRRLELASN HFLYLPRDVL AQLPSLRHLD 24 0 

LSNNSLVSLT YVSFRNLTHL ESLHLEDNAL KVLHNGTLAE LQGLPHIRVF LDNNPWVCDC 300 

HMADMVTWLK ETEWQGKDR LTCAYPEKMR NRVLLELNSA DLDCDPILPP SLQTSYVFLG 360 
IVLALIGAIF LLVLYLNRKG IKKWMHNIRD ACRDHMEGYH YRYEINADPR LTNLS SNSDV 
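
The listing does not say how the TM domains were predicted. As an illustration only, a Kyte-Doolittle hydropathy average (a stand-in method, not necessarily the one used for this table) shows why residues 357-373 of BCA7 read as a membrane-spanning segment. The function name `mean_hydropathy` and the `OFFSET` constant are assumptions made for this sketch.

```python
# Kyte-Doolittle hydropathy scale.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(protein, start, end):
    """Mean score over 1-based inclusive residues start..end.

    Residues without a score (for example X) are skipped.
    """
    scores = [KD[aa] for aa in protein[start - 1:end] if aa in KD]
    return sum(scores) / len(scores) if scores else 0.0


# Residues 351-380 of the BCA7 protein listing (SEQ ID NO:4).
OFFSET = 350
bca7_351_380 = "SLQTSYVFLG" + "IVLALIGAIF" + "LLVLYLNRKG"
tm_score = mean_hydropathy(bca7_351_380, 357 - OFFSET, 373 - OFFSET)
tail_score = mean_hydropathy("NRKGIKKWMHNIRD", 1, 14)
print(round(tm_score, 2), round(tail_score, 2))
```

A strongly positive average over the predicted span, next to a negative average for the charged stretch that follows it, is the pattern expected for a single-pass membrane protein.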

BCX5 DNA sequence (SEQ ID NO: 5)

Gene name: LNIR; Unigene number: Hs.61460; Probeset Accession #: AA028028; Nucleic Acid
Accession #: AF160477; Coding sequence: 225-1757 (start and stop codons underlined)

GGGGAGCTCG GAGCTCCCGA TCACGGCTTC TTGGGGGTAG CTACGGCTGG GTGTGTAGAA 6 0 

CGGGGCCGGG GCTGGGGCTG GGTCCCCTAG TGAGACCCAA GTGCGAGAGG CAAGAACTCT 120 

GCAGCTTCCT GCCTTCTGGG TCAGTTCCTT ATTCAAGTCT GCAGCCGGCT CCCAGGGAGA 180 

TCTCGGTGGA ACTT CAGAAA CGCTGGGCAG TCTGCCTTTC AACCATGCCC CTGTCCCTGG 24 0 

GAGCCGAGAT GTGGGGGCCT GAGGCCTGGC TGCTGCTGCT GCTACTGCTG GC AT CATTT A 3 00 

CAGGCCGGTG CCCCGCGGGT GAGCTGGAGA CCTCAGACGT GGTAACTGTG GTGCTGGGCC 3 60 

AGGACGCAAA ACTGCCCTGC TTCTACCGAG GGGACTCCGG CGAGCAAGTG GGGCAAGTGG 420 
CATGGGCTCG GGTGGACGCG GGCGAAGGCG CCCAGGAACT AGCGCTACTG CACTCCAAAT 480
ACGGGCTTCA TGTGAGCCCG GCTTACGAGG GCCGCGTGGA GCAGCCGCCG CCCCCACGCA 54 0 
ACCCCCTGGA CGGCTCAGTG CTCCTGCGCA ACG CAGTG CA GGCGGATGAG GGCGAGTACG 600 
AGTGCCGGGT CAGCACCTTC CCCGCCGGCA GCTTCCAGGC GCGGCTGCGG CTCCGAGTGA 660 
TGGTGCCTCC CCTGCCCTCA CTGAATCCTG GTCCAGCACT AGAAGAGGGC CAGGGCCTGA 720 
CCCTGGCAGC CTCCTGCACA GCTGAGGGCA GCCCAGCCCC CAGCGTGACC TGGGACACGG 780 
AGGTCAAAGG CACAACGTCC AGCCGTTCCT TCAAGCACTC CCGCTCTGCT GCCGTCACCT 84 0 
CAGAGTTCCA CTTGGTGCCT AGCCGCAGCA TGAATGGGCA GCCACTGACT TGTGTGGTGT 900 
CCCATCCTGG CCTGCTCCAG GACCAAAGGA TCACCCACAT CCTCCACGTG TCCTTCCTTG 960 
CTGAGGCCTC TGTGAGGGGC CTTGAAGACC AAAATCTGTG GCACATTGGC AGAGAAGGAG 102 0 
CTATGCTCAA GTGCCTGAGT GAAGGGCAGC CCCCTCCCTC ATACAACTGG ACACGGCTGG 1080 
ATGGGCCTCT GCCCAGTGGG GTACGAGTGG ATGGGGACAC TTTGGGCTTT CCCCCACTGA 114 0 
CCACTGAGCA CAGCGGCATC TACGTCTGCC ATGTCAGCAA TGAGTTCTCC TCAAGGGATT 12 00 
CTCAGGTCAC TGTGGATGTT CTTGACCCCC AGGAAGACTC TGGGAAGCAG GTGGACCTAG 126 0 
TGTCAGCCTC GGTGGTGGTG GTGGGTGTGA TCGCCGCACT CTTGTTCTGC CTTCTGGTGG 13 2 0 
TGGTGGTGGT GCTCATGTCC CGATACCATC GGCGCAAGGC CCAGCAGATG ACCCAGAAAT 13 80 
ATGAGGAGGA GCTGACCCTG ACCAGGGAGA ACTCCATCCG GAGGCTGCAT TCCCATCACA 144 0 
CGGACCCCAG GAGCCAGCCG GAGGAGAGTG TAGGGCTGAG AGCCGAGGGC CACCCTGATA 15 00 
GTCTCAAGGA CAACAGTAGC TGCTCTGTGA TGAGTGAAGA GCCCGAGGGC CGCAGTTACT 156 0 
CCACGCTGAC CACGGTGAGG GAGATAGAAA CACAGACTGA ACTGCTGTCT CCAGGCTCTG 162 0 
GGCGGGCCGA GGAGGAGGAA GATCAGGATG AAGGCATCAA ACAGGCCATG AACCATTTTG 1680
TTCAGGAGAA TGGGACCCTA CGGGCCAAGC CCACGGGCAA TGGCATCTAC ATCAATGGGC 1740
GGGGACACCT GGTCTGACCC AGGCCTGCCT CCCTTCCCTA GGCCTGGCTC CTTCTGTTGA 1800
CATGGGAGAT TTTAGCTCAT CTTGGGGGCC TCCTTAAACA CCCCCATTTC TTGCGGAAGA 1860 
TGCTCCCCAT CCCACTGACT GCTTGACCTT TACCTCCAAC CCTTCTGTTC ATCGGGAGGG 1920 
CTCCACCAAT TGAGTCTCTC CCACCATGCA TGCAGGTCAC TGTGTGTGTG CATGTGTGCC 1980 
TGTGTGAGTG TTGACTGACT GTGTGTGTGT GGAGGGGTGA CTGTCCGTGG AGGGGTGACT 2040 
GTGTCCGTGG TGTGTATTAT GCTGTCATAT CAGAGTCAAG TGAACTGTGG TGTATGTGCC 2100 
ACGGGATTTG AGTGGTTG CG TGGGCAACAC TGTCAGGGTT TGGCGTGTGT GTCATGTGGC 216 0 
TGTGTGTGAC CTCTGCCTGA AAAAGCAGGT ATTTTCTCAG ACCCCAGAGC AGTATTAATG 22 20 
ATGCAGAGGT TGGAGGAGAG AGGTGGAGAC TGTGGCTCAG ACCCAGGTGT GCGGGCATAG 22 80 
CTGGAGCTGG AATCTGCCTC CGGTGTGAGG GAACCTGTCT CCTACCACTT CGGAGCCATG 234 0 
GGGGCAAGTG TGAAGCAGCC AGTCCCTGGG TCAGCCAGAG GCTTGAACTG TTACAGAAGC 24 00 
CCTCTGCCCT CTGGTGGCCT CTGGGCCTGC TGCATGTACA TATTTTCTGT AAATATACAT 246 0 
G CG CCGGGAG CTTCTTGCAG GAATACTGCT CCGAATCACT TTTAATTTTT TTCTTTTTTT 2520 
TTTCTTGCCC TTTCCATTAG TTGTATTTTT TATTTATTTT TATTTTTATT TTTTTTTAGA 2 580 
GATGGAGTCT CACTATGTTG CTCAGGCTGG CCTTGAACTC CTGGGCTCAA GCAATCCTCC 264 0 
TGCCTCAGCC TCCCTAGTAG CTGGGACTTT AAGTGTACAC CACTGTGCCT GCTTTGAATC 2 70 0 
CTTTACGAAG AGAAAAAAAA AATTAAAGAA AGCCTTTAGA TTTATCCAAT GTTTACTACT 2760
GGGATTGCTT AAAGTGAGGC CCCTCCAACA CCAGGGGGTT AATTCCTGTG ATTGTGAAAG 2820
GGGCTACTTC CAAGGCATCT TCATGCAGGC AGCCCCTTGG GAGGGCACCT GAGAGCTGGT 2880
AGAGTCTGAA ATTAGGGATG TGAGCCTCGT GGTTACTGAG TAAGGTAAAA TTGCATCCAC 2940
CATTGTTTGT GATACCTTAG GGAATTGCTT GGACCTGGTG ACAAGGGCTC CTGTTCAATA 3000
GTGGTGTTGG GGAGAGAGAG AGCAGTGATT ATAGACCGAG AGAGTAGGAG TTGAGGTGAG 3060
GTGAAGGAGG TGCTGGGGGT GAGAATGTCG CCTTTCCCCC TGGGTTTTGG ATCACTAATT 3120
CAAGGCTCTT CTGGATGTTT CTCTGGGTTG GGGCTGGAGT TCAATGAGGT TTATTTTTAG 3180 
CTGGCCCACC CAGATACACT CAGCCAGAAT ACCTAGATTT AGTACCCAAA CTCTTCTTAG 3 24 0 
TCTGAAATCT GCTGGATTTC TGGCCTAAGG GAGAGGCTCC CATCCTTCGT TCCCCAGCCA 3300 
GCCTAGGACT TCGAATGTGG AGCCTGAAGA TCTAAGATCC TAACATGTAC ATTTTATGTA 3 36 0 
AATATGTGCA TATTTGTACA TAAAATGATA TTCTGTTTTT AAATAAACAG ACAAAACTTG 3420 
TTCAAAAAAA AAAAAAAAAA AAAAAAAAA 

BCX5 Protein sequence (SEQ ID NO:6)
Gene name: LNIR; Unigene number: Hs.61460; Probeset Accession #: AA028028; Protein
Accession #: AF160477; Predicted Signal sequence: 1-26; Predicted TM domains: 355-371; PFAM
domains: IgSF_domain: 45-129, 162-225, 263-317; Summary: A type Ia TM protein; is a member
of the immunoglobulin superfamily. 

MPLSLGAEMW G PEAWLLLLL LLASFTGRCP AGELETSDW TWLGQDAKL PCFYRGDSGE 60 
QVGQVAWARV DAG EG AQE LA LLHSKYGLHV SPAYEGRVEQ PPPPRNPLDG SVLLRNAVQA 120 
DEGEYECRVS TFPAGSFQAR LRLRVMVP PL PSLNPGPALE EGQGLTLAAS CTAEGSPAPS 180 
VTWDTEVKGT TSSRSFKHSR SAAVTSEFHL VPSRSMNGQP LTCWSHPGL LQDQRITHIL 24 0 
HVSFLAEASV RGLEDQNLWH IGREGAMLKC LSEGQPPPSY NWTRLDGPLP SGVRVDGDTL 3 00 
GFPPLTTEHS GIYVCHVSNE FSSRDSQVTV DVLDPQEDSG KQVDLVSASV VWGVIAALL 3 60 
FCLLVWWL MSRYHRRKAQ QMTQKYEEEL TLTRENSIRR LHSHHTDPRS QPEESVGLRA 4 20 
EGHPDSLKDN SSCSVMSEEP EGRSYSTLTT VREIETQTEL LSPGSGRAEE EEDQDEGIKQ 4 80 
AMNHFVQENG TLRAKPTGNG I Y I NGRGHLV 

Mouse BCX5 Protein sequence (SEQ ID NO: 7)
Gene name: mouse LNIR; Unigene number: n/a; Probeset Accession #: BF168327; Protein
Accession #: n/a; Predicted Signal sequence: 1-27; Predicted TM domains: 346-362; PFAM
domains: IgSF_domain: 44-126, ll6-221, 259-313; Summary: This is the mouse orthologue of human
BCX5; it is a type Ia TM protein of unknown function.

MPLSLGAEMW GPEAWLRLLF LASFTGQYSA GELETSDWT WLGQDAKLP CFYRGDPDEQ 6 0 
VGQVAWARVD PNEXYPGAGL LHSKYGLHVN PAYEDRVEQX XHETFRRSVL LRNAVQADEG 120 



EYECRVSTFP SGSFQARMRL RVLVPPLPSL NPGPPLEEGQ ADVAASCTAE GSPAPSVTWD 180

TEVKGTQSSR SFTHPRSAAV TSEFHLVPSR SMNGQPLTCV VSHPGLLQDR RITHTLQVAF 24 0 

LAEASVRGLE DQNLWQVGRE GATLKCLSEG QPPPKYNWTR LDGPLPSGVR VKGDTLGFPP 3 00 

LTTEHSGVYX CHVSNELSSR DSQVTVEVLD PEDPGKQVDL VSASVIIVGV IAALLFCLLV 36 0 

WWLMSRYH RRKAQQMTQK YEEELTLTRE NSIRRLHSHH SDPRSQPEES VGLRAEGHPD 420 

SLKDNSSCSV MSEEPEGRSY STLTTVREIE TQTELLSPGS GRTEEDDDQD EGIKQAMNHL 48 0 
CRKMGP 

BC26 DNA sequence (SEQ ID NO: 8) 

Gene name: IL-6 receptor beta chain (gp130; oncostatin M receptor); Unigene number:
Hs.82065; Probeset Accession #: M57230 / AA406546; Nucleic Acid Accession #: NM_002184;
Coding sequence: 256-3012 (start and stop codons underlined)

GAGCAGCCAA AAGGCCCGCG GAGTCGCGCT GGGCCGCCCC GGCGCAGCTG AACCGGGGGC 6 0 

CGCGCCTGCC AGGCCGACGG GTCTGGCCCA GCCTGGCGCC AAGGGGTTCG TGCGCTGTGG 12 0 

AGACGCGGAG GGTCGAGGCG GCGCGGCCTG AGTGAAACCC AATGGAAAAA GCATGACATT 18 0 

TAGAAGTAGA AGACTTAGCT TCAAATCCCT ACTCCTTCAC TTACTAATTT TGTGATTTGG 24 0 

AAATATCCGC GCAAGATGTT GACGTTGCAG ACTTGGGTAG TGCAAGCCTT GTTTATTTTC 3 00 

CTCACCACTG AATCTACAGG TGAACTTCTA GATCCATGTG GTTATATCAG TCCTGAATCT 36 0 

CCAGTTGTAC AACTTCATTC TAATTTCACT GCAGTTTGTG TGCTAAAGGA AAAATGTATG 420 

GATTATTTTC ATGTAAATGC TAATTACATT GTCTGGAAAA CAAACCATTT TACTATTCCT 480 

AAGGAGCAAT ATACTATCAT AAACAGAACA GCATCCAGTG TCACCTTTAC AGATATAGCT 54 0 

TCATTAAATA TTCAGCTCAC TTGCAACATT CTTACATTCG GACAGCTTGA ACAGAATGTT 600 

TATGGAATCA CAATAATTTC AGGCTTGCCT CCAGAAAAAC CTAAAAATTT GAGTTGCATT 660 

GTGAACGAGG GGAAGAAAAT GAGGTGTGAG TGGGATGGTG GAAGGGAAAC ACACTTGGAG 72 0 

ACAAACTTCA CTTTAAAATC TGAATGGGCA ACACACAAGT TTGCTGATTG CAAAGCAAAA 78 0 

CGTGACACCC CCACCTCATG CACTGTTGAT TATTCTACTG TGTATTTTGT CAACATTGAA 840 

GTCTGGGTAG AAGCAGAGAA TGCCCTTGGG AAGGTTACAT CAGATCATAT CAATTTTGAT 900 

CCTGTATATA AAGTGAAGCC CAATCCGCCA CATAATTTAT CAGTGATCAA CTCAGAGGAA 96 0 

CTGTCTAGTA TCTTAAAATT GACATGGACC AAGCCAAGTA TTAAGAGTGT TATAATACTA 1020 

AAATATAACA TTCAATATAG GACCAAAGAT GCCTCAACTT GGAGCCAGAT TCCTCCTGAA 1080 

GACACAGCAT CCACCCGATC TTCATTCACT GTCCAAGACC TTAAACCTTT TACAGAATAT 114 0 

GTGTTTAGGA TTCGCTGTAT GAAGGAAGAT GGTAAGGGAT ACTGGAGTGA CTGGAGTGAA 1200 

GAAGCAAGTG GGATCACCTA TGAAGATAGA CCATCTAAAG CACCAAGTTT CTGGTATAAA 1260 

ATAGATCCAT CCCATACTCA AGGCTACAGA ACTGTACAAC TCGTGTGGAA GACATTGCCT 132 0 

CCTTTTGAAG CCAATGGAAA AATCTTGGAT TATGAAGTGA CTCTCACAAG ATGGAAATCA 13 80 

CATTTACAAA ATTACACAGT TAATGCCACA AAACTGACAG TAAATCTCAC AAATGATCGC 144 0 

TATCTAGCAA CCCTAACAGT AAGAAATCTT GTTGGCAAAT CAGATGCAGC TGTTTTAACT 1500 

ATCCCTGCCT GTGACTTTCA AGCTACTCAC CCTGTAATGG ATCTTAAAGC ATTCCCCAAA 1560 

GATAACATGC TTTGGGTGGA ATGGACTACT CCAAGGGAAT CTGTAAAGAA ATATATACTT 16 20 

GAGTGGTGTG TGTTATCAGA TAAAGCACCC TGTATCACAG ACTGGCAACA AGAAGATGGT 16 80 

ACCGTGCATC GCACCTATTT AAGAGGGAAC TTAGCAGAGA GCAAATGCTA TTTGATAACA 1740 

GTTACTCCAG TATATGCTGA TGGACCAGGA AGCCCTGAAT CCATAAAGGC ATACCTTAAA 18 00 

CAAGCTCCAC CTTCCAAAGG ACCTACTGTT CGGACAAAAA AAGTAGGGAA AAACGAAGCT 186 0 

GTCTTAGAGT GGGACCAACT TCCTGTTGAT GTTCAGAATG GATTTATCAG AAATTATACT 1920 

ATATTTTATA GAACCATCAT TGGAAATGAA ACTGCTGTGA ATGTGGATTC TTCCCACACA 198 0 

GAATATACAT TGTCCTCTTT GACTAGTGAC ACATTGTACA TGGTACGAAT GGCAGCATAC 204 0 

ACAGATGAAG GTGGGAAGGA TGGTCCAGAA TTCACTTTTA CTACCCCAAA GTTTGCTCAA 2100 

GGAGAAATTG AAGCCATAGT CGTGCCTGTT TGCTTAGCAT TCCTATTGAC AACTCTTCTG 2160 

GGAGTGCTGT TCTGCTTTAA TAAGCGAGAC CTAATTAAAA AACACATCTG GCCTAATGTT 222 0 

CCAGATCCTT CAAAGAGTCA TATTGCCCAG TGGTCACCTC ACACTCCTCC AAGGCACAAT 2280 

TTTAATTCAA AAGATCAAAT GTATTCAGAT GGCAATTTCA CTGATGTAAG TGTTGTGGAA 234 0 

ATAGAAGCAA ATGACAAAAA GCCTTTTCCA GAAGATCTGA AATCATTGGA CCTGTTCAAA 2400 

AAGGAAAAAA TTAATACTGA AGGACACAGC AGTGGTATTG GGGGGTCTTC ATGCATGTCA 246 0 

TCTTCTAGGC CAAGCATTTC TAGCAGTGAT GAAAATGAAT CTTCACAAAA CACTTCGAGC 2 520 

ACTGTCCAGT ATTCTACCGT GGTACACAGT GGCTACAGAC ACCAAGTTCC GTCAGTCCAA 2580 

GTCTTCTCAA GATCCGAGTC TACCCAGCCC TTGTTAGATT CAGAGGAGCG GCCAGAAGAT 2640 

CTACAATTAG TAGATCATGT AGATGGCGGT GATGGTATTT TGCCCAGGCA ACAGTACTTC 270 0 

AAACAGAACT GCAGTCAGCA TGAATCCAGT CCAGATATTT CACATTTTGA AAGGTCAAAG 2760 

CAAGTTTCAT CAGTCAATGA GGAAGATTTT GTTAGACTTA AACAGCAGAT TTCAGATCAT 2 82 0 

ATTTCACAAT CCTGTGGATC TGGGCAAATG AAAATGTTTC AGGAAGTTTC TGCAGCAGAT 2880 

GCTTTTGGTC CAGGTACTGA GGGACAAGTA GAAAGATTTG AAACAGTTGG CATGGAGGCT 2 94 0 

GCGACTGATG AAGGCATGCC TAAAAGTTAC TTACCACAGA CTGTACGGCA AGGCGGCTAC 3 000 

ATGCCTCAGT GAAGGACTAG TAGTTCCTGC TACAACTTCA GCAGTACCTA TAAAGTAAAG 3060 
CTAAAATGAT TTTATCTGTG AATTC 

BC26 Protein sequence (SEQ ID NO: 9) 

Gene name: IL-6 receptor beta chain (gp130; oncostatin M receptor); Unigene number: 
Hs. 82065; Probeset Accession #: M57230 / AA406546; Protein Accession #: NP_002175; Predicted 
Signal sequence: 1-22; Predicted TM domains: 625-641; PFAM domains: 
fibronectin type III domains: 222-311, 424-509, 519-606; Summary: A type I TM protein; it 
homodimerizes or heterodimerizes to make a functional receptor for IL-6, oncostatin-M, IL-11, 
LIF, and CNTF. 

MLTLQTWWQ ALFIFLTTES TGELLDPCGY ISPESPWQL HSNFTAVCVL KEKCMDYFHV 60 
NANYIVWKTN HFTIPKEQYT IINRTASSVT FTDIASLNIQ LTCNILTFGQ LEQNVYGITI 120 
ISGLPPEKPK NLSCIVNEGK KMRCEWDGGR ETHLETNFTL KSEWATHKFA DCKAKRDTPT 180 
SCTVDYSTVY FVNIEVWVEA ENALGKVTSD HINFDPVYKV KPNPPHNLSV INSEELSSIL 240 
KLTWTNPSIK SVIILKYNIQ YRTKDASTWS QIPPEDTAST RSSFTVQDLK PFTEYVFRIR 300 
CMKEDGKGYW SDWSEEASGI TYEDRPSKAP SFWYKIDPSH TQGYRTVQLV WKTLPPFEAN 360 
GKILDYEVTL TRWKSHLQNY TVNATKLTVN LTNDRYLATL TVRNLVGKSD AAVLTIPACD 420 
FQATHPVMDL KAFPKDNMLW VEWTTPRESV KKYILEWCVL SDKAPCITDW QQEDGTVHRT 480 
YLRGNLAESK CYLITVTPVY ADGPGSPESI KAYLKQAPPS KGPTVRTKKV GKNEAVLEWD 540 
QLPVDVQNGF IRNYTIFYRT IIGNETAVNV DSSHTEYTLS SLTSDTLYMV RMAAYTDEGG 600 
KDGPEFTFTT PKFAQGEIEA IWPVCLAFL LTTLLGVLFC FNKRDLIKKH IWPNVPDPSK 660 
SHIAQWSPHT PPRHNFNSKD QMYSDGNFTD VSWEIEAND KKPFPEDLKS LDLFKKEKIN 720 
TEGHSSGIGG SSCMSSSRPS ISSSDENESS QNTSSTVQYS TWHSGYRHQ VPSVQVFSRS 780 
ESTQPLLDSE ERPEDLQLVD HVDGGDGILP RQQYFKQNCS QHESSPDISH FERSKQVSSV 840 
NEEDFVRLKQ QISDHISQSC GSGQMKMFQE VSAADAFGPG TEGQVERFET VGMEAATDEG 900 
MPKSYLPQTV RQGGYMPQ 

BFG4 DNA sequence (SEQ ID NO:10) 

Gene name: KIAA0882 protein; Unigene number: Hs. 90419; Probeset Accession #: Z39762; 
Nucleic Acid Accession #: AB020689; Coding sequence: 108-2777 (start and stop codons 
underlined) 

GAACTTATGT AGCCTCATTA TCCCGCTCCG TGAGGTGACA ATTGTGGAAA AGGCAGACAG 60 

CTCCAGTGTG CTCCCCAGTC CCTTATCACA TCAGCACCCG AAACAGGATG ACCTTCCTAT 12 0 

TTGCCAACTT GAAAGATAGA GACTTTCTAG TGCAGAGGAT CTCAGATTTC CTGCAACAGA 180 

CTACTTCCAA AATATATTCT GACAAGGAGT TTGCAGGAAG TTACAACAGT TCAGATGATG 240 
AGGTGTACTC TCGACCCAGC AGCCTCGTCT CCTCCAGCCC CCAGAGAAGC ACGAGCTCTG 300 
ATGCTGATGG AGAGCGCCAG TTTAACCTAA ATGGCAACAG CGTCCCCACA GCCACACAGA 360 
CCCTGATGAC CATGTATCGG CGGCGGTCTC CCGAGGAGTT CAACCCGAAA TTGGCCAAAG 420 
AGTTTCTGAA AGAGCAAGCC TGGAAGATTC ACTTTGCTGA GTATGGGCAA GGGATCTGCA 480 
TGTACCGCAC AGAGAAAACG CGGGAGCTGG TGTTGAAGGG CATCCCGGAG AGCATGCGTG 540 
GGGAGCTCTG GCTGCTGCTG TCAGGTGCCA TCAATGAGAA GGCCACACAT CCTGGGTACT 600 
ATGAAGACCT AGTGGAGAAG TCCATGGGGA AGTATAATCT CGCCACGGAG GAGATTGAGA 660 
GGGATTTACA CCGCTCCCTT CCAGAACACC CAGCTTTTCA GAATGAAATG GGCATTGCTG 720 
CACTAAGGAG AGTCTTAACA GCTTATGCTT TTCGAAATCC CAACATAGGG TATTGCCAGG 780 
CCATGAATAT TGTCACTTCA GTGCTGCTGC TTTATGCCAA AGAGGAGGAA GCTTTCTGGC 840 
TGCTTGTGGC TTTGTGTGAG CGCATGCTCC CAGATTACTA CAACACCAGA GTTGTGGGTG 900 
CACTGGTGGA CCAAGGTGTC TTTGAGGAGC TAGCACGAGA CTACGTCCCA CAGCTGTACG 960 

ACTGCATGCA AGACCTGGGC GTGATTTCCA CCATCTCCCT GTCTTGGTTC CTCACACTAT 1020 

TTCTCAGTGT GATGCCTTTT GAGAGTGCAG TTGTGGTTGT TGACTGTTTC TTCTATGAAG 1080 

GAATTAAAGT GATATTCCAG TTGGCCCTAG CTGTGCTGGA TGCAAATGTG GACAAACTGT 114 0 

TGAACTGCAA GGATGATGGG GAGGCCATGA CCGTTTTGGG AAGGTATTTA GACAGTGTGA 12 00 

CCAATAAAGA CAGCACACTG CCTCCCATTC CTCACCTCCA CTCCTTGCTC AGCGATGATG 1260 

TGGAACCTTA CCCTGAGGTA GACATCTTTA GACTCATCAG AACTTCCTAC GAGAAATTCG 132 0 

GAACTATCCG GGCAGATTTG ATTGAACAGA TGAGATTCAA ACAGAGACTG AAAGTGATCC 1380 

AGACGCTGGA GGATACTACG AAACGCAACG TGGTACGAAC CATTGTGACA GAAACTTCCT 144 0 

TTACCATTGA TGAGCTGGAA GAACTTTATG CTCTTTTCAA GGCAGAACAT CTCACCAGCT 1500 

GCTACTGGGG CGGGAGCAGC AACGCGCTGG ACCGGCATGA CCCCAGCCTG CCCTACCTGG 1560 

AACAGTATCG CATTGACTTC GAGCAGTTCA AGGGAATGTT TGCTCTTCTC TTTCCTTGGG 162 0 

CATGTGGAAC TCACTCTGAC GTTCTGGCCT CCCGCTTGTT CCAGTTATTA GATGAAAATG 16 80 

GAGACTCTTT GATTAACTTC CGGGAGTTTG TCTCTGGGCT AAGTGCTGCA TGCCATGGGG 174 0 

ACCTCACAGA GAAGCTCAAA CTCCTGTACA AAATGCACGT CTTGCCTGAG CCATCCTCTG 1800 

ATCAAGATGA ACCAGATTCT GCTTTTGAAG CAACTCAGTA CTTCTTTGAA GATATTACCC 186 0 

CAGAATGTAC ACATGTTGTT GGATTGGATA GCAGAAGCAA ACAGGGTGCA GATGATGGCT 1920 

TTGTTACGGT GAGCCTAAAG CCAGACAAAG GGAAGAGAGC AAATTCCCAA GAAAATCGTA 1980 

ATTATTTGAG ACTGTGGACT CCAGAAAATA AATCTAAGTC AAAGAATGCA AAGGATTTAC 2040 

CCAAATTAAA TCAGGGGCAG TTCATTGAAC TGTGTAAGAC AATGTATAAC ATGTTCAGCG 2100 

AAGACCCCAA TGAGCAGGAG CTGTACCATG CCACGGCAGC AGTGACCAGC CTCCTGCTGG 216 0 

AGATTGGGGA GGTCGGCAAG TTGTTCGTGG CCCAGCCTGC AAAGGAGGGC GGGAGCGGAG 2220 

GCAGTGGGCC GTCCTGCCAC CAGGGCATCC CAGGCGTGCT CTTCCCCAAG AAAGGGCCAG 228 0 

GCCAGCCTTA CGTGGTGGAG TCTGTTGAGC CCCTGCCGGC CAGCCTGGCC CCCGACAGCG 234 0 

AGGAACACTC CCTTGGAGGA CAAATGGAGG ACATCAAGCT GGAGGACTCC TCGCCCCGGG 24 00 

ACAACGGGGC CTGCTCCTCC ATGCTGATCT CTGACGACGA CACCAAGGAC GACAGCTCCA 2460 

TGTCCTCATA CTCGGTGCTG AGTGCCGGCT CCCACGAGGA GGACAAGCTG CACTGCGAGG 2 520 

AAATCGGAGA GGACACGGTC CTGGTGCGGA GCGGCCAGGG CACGGCGGCA CTGCCCCGGA 2 580 

GCACCAGCCT GGACCGGGAC TGGGCCATCA CCTTCGAGCA GTTCCTGGCC TCCCTCTTAA 264 0 

CTGAGCCTGC CCTGGTCAAG TACTTTGACA AGCCCGTGTG CATGATGGCC AGGATTACCA 2700 
GTGCAAAAAA CATCCGGATG ATGGGCAAGC CCCTCACCTC GGCCAGTGAC TATGAAATCT 2760 
CGGCCATGTC CGGCTGACAC GGGCGCCTTC CCGGGGGAGT GGGAGGAGAG GGAGGGGAGG 2820 
GATTTTTTAT GTTCTTCTGT GTTGAGTTTT TTCTTTCTTT CTTTTAAATT AAATATTTAT 2880 
TAGTACCTGG AATTGAAGCC TAGTGTTTTC ATAATGTAAT TCAATGAAAA CTGTTGGAGA 2940 
AATATTTAAA CACCTCAATG TAGGTACATT ACACTCTTGT TGCGGGGAGG GGATTTACCA 3000 
GAATACAGTT TATTTCGTGA ATTCTAAAAA ACAAAAAGAT GAATCTGTCA GTGATATGTG 3060 
TGTATTATAA CTTATTAATC TTGCTGTTGA GCTGTATACA TGGTTTAAAA AATAGTACTG 3120 
TTTAATGCTA AGTAAGGCAG CAGTCATTTG TGTATTCAGG CTTTTTAAAT AAAATTAGAG 3180 
CTGTAAGGAA AATGAAAAGC CACAAATGCA AGACTGTTCT TAAATGGAAG GCATAGTCAG 3240 
CGAGGGTAAA TCCTATACCA CTTTAGGAAG TATTAAAAAT ATTTTTAAGA TTTGAAATAT 3300 
ATTTCATAGA AGTCCTCTAT TCAAAATCAT ATTCCACAGA TGTTCCCCTT CAAAGGGAAA 3360 
ACATTTGGGG TTCTAAACAG TTATGAAAGT AAGTGATTTT TACATGATTC CAGAATAACA 3420 
CTTGTATTGA CCAATTTAGA CAGATACCAG ACCAATTTTG CATTTAAGAA ATTGTTCTGA 3480 
TTATTTACGT CAACTCATTA GAATTCAGTG AAAAGTAACA GTCTTTTGTC ACAGAGAATC 3540 
TGAAAGTAGC AGCAAAGACA GAGGGCTCAT GACAGGTTTT TGCTTTTGCT TTGCTTTTGT 3600 
TTTTGAAAGA GTAAAAGTAC TGATGCTTCT GATACTGGAT GTTTAGCTTC TTACTGCAAA 3660 
AACATAAGTA AAACAGTCAA CTTTACCATT TCCGTATTCT CCATAGATTG AAGAAATTTA 3720 
TACCACATAT CGCATATGAC CATCTTTCCA TCAAATCAAT GTAGAGATAA TGTAAACTGA 3780 
AAAAAAATCT GCAAGATAAT GTAACTGAAT GTTTTAAAAA CAGAACTTGT CACTTTATAT 3840 
AAAAGAATAG TATGCTCTAT TTCCTGAATG GATGTGGAAA TGAAAGCTAG CGCACCTGCA 3900 
CTTTGAATTC TTGCTTCTTT TTTATTACTG TTATGATTTT GCTTTTTACA GATGTTGGAC 3960 
GATTTTTTCT TCTGATTGTT GAATTCATAA TCATGGTCTC ATTTCCTTTG CTTCTTTGGA 4020 
ATATTTCTTT CAACACATTC CTTTATTTTA TTATACATTG TGTCCTTTTT TTAGCTATTG 4080 
CTGCTGTTGT TTTTTATTCT ATTTACAGGA TGATTTTTAA ACTGTCAAAT GAAGTAGTGT 4140 
TAACCTCAAA TAGGCTAAAT GTGAACAAAT AAAATACAGC AAATACTCAG AAAAAAAAAA 4200 
AAAAAAAAAA AAAAA 



BFG4 Protein sequence (SEQ ID NO:11) 

Gene name: KIAA0882 protein; Unigene number: Hs. 90419; Probeset Accession #: Z39762; 
Protein Accession #: BAA74905; Signal sequence: none; Predicted TM domains: 302-318; PFAM 
domains: TBC_domain: 135-347; Summary: a Type II membrane protein likely localized to the 
peroxisome. 

MTFLFANLKD RDFLVQRISD FLQQTTSKIY SDKEFAGSYN SSDDEVYSRP SSLVSSSPQR 60 
STSSDADGER QFNLNGNSVP TATQTLMTMY RRRSPEEFNP KLAKEFLKEQ AWKIHFAEYG 120 
QGICMYRTEK TRELVLKGIP ESMRGELWLL LSGAINEKAT HPGYYEDLVE KSMGKYNLAT 180 
EEIERDLHRS LPEHPAFQNE MGIAALRRVL TAYAFRNPNI GYCQAMNIVT SVLLLYAKEE 240 
EAFWLLVALC ERMLPDYYNT RWGALVDQG VFEELARDYV PQLYDCMQDL GVISTISLSW 300 
FLTLFLSVMP FESAWWDC FFYEGIKVIF QLALAVLDAN VDKLLNCKDD GEAMTVLGRY 360 
LDSVTNKDST LPPIPHLHSL LSDDVEPYPE VDIFRLIRTS YEKFGTIRAD LIEQMRFKQR 420 
LKVIQTLEDT TKRNWRTIV TETSFTIDEL EELYALFKAE HLTSCYWGGS SNALDRHDPS 480 
LPYLEQYRID FEQFKGMFAL LFPWACGTHS DVLASRLFQL LDENGDSLIN FREFVSGLSA 540 
ACHGDLTEKL KLLYKMHVLP EPSSDQDEPD SAFEATQYFF EDITPECTHV VGLDSRSKQG 600 
ADDGFVTVSL KPDKGKRANS QENRNYLRLW TPENKSKSKN AKDLPKLNQG QFIELCKTMY 660 
NMFSEDPNEQ ELYHATAAVT SLLLEIGEVG KLFVAQPAKE GGSGGSGPSC HQGIPGVLFP 720 
KKGPGQPYW ESVEPLPASL APDSEEHSLG GQMEDIKLED SSPRDNGACS SMLISDDDTK 780 
DDSSMSSYSV LSAGSHEEDK LHCEEIGEDT VLVRSGQGTA ALPRSTSLDR DWAITFEQFL 840 
ASLLTEPALV KYFDKPVCMM ARITSAKNIR MMGKPLTSAS DYEISAMSG 



BCU7 DNA sequence (SEQ ID NO: 12) 
Gene name: EST; Unigene number: Hs. 98558; Probeset Accession #: AA428062; Nucleic Acid 
Accession #: n/a; Coding sequence: 1-573 (stop codon underlined) 

TATTTTATTT TCCAGGCTAA AGCAAATGAA AGTTTGCTGG TATCAACACA GCCTGCCATA 60 
TTTTTCACAG CATGCAACAA TGGTGCTAGG ATAGCTATTT CTTACTGTAA TTGCCAGAGG 120 
CAGAAATGGT CTGGGTATAA GCTATTTCAT AAAAGCAGCT TTAAATTGTC AGTATTAAGG 180 
TTTTCATGTG GAAAGGTGTC ATTCAAAAAA AAAGTAATTG GCATACATAT TCCACATCAT 240 
CGATCCTCTC TGTGGTGTTA ATTTTTTTAT ATGACCAGTA GAAAAATTTT AATATTCTCA 300 
CAATATAGGT TTTGGGGCTT CCATATCATC AAAAGACTGA AAAATTATAA TTTTAGAATT 360 
AAACTGATGG ATTTCATTAT AGAATTATCT GTGAGTTGTG TAGACACAGT CTTAATGTTT 420 
CTGGTTATGA CAGATAAGTT TGCTCAAAAA ATGTGGATGA AGCCATTATT GTTATTATTG 480 
TTATTGCTTC TGTTCAGTTG TCTAAGTATC ATCCCTTCTG TGGCCCATCA CGCAGCAGAG 540 
TTGCCCTACA AATTTCATTT GGCAGCGCCA TAACATTCAT TTAAAAAGTT TATGAAAACA 600 
TTCATTTGAA AGTTCCATGC AGCTTTAGCA CAGAGTTGAC CAAACACTGG CGTAAGTTCA 660 
ATTTACACAG AATATTTGAA TTGAAACAAT AGAAATTTTT CTCATAATAT ATACCTATGT 720 
GAAACCAACT TATCTGCATA ATTAAATCTA ATACATATTT AAGCCAGTTT AAGTGCTTTG 780 
TGTTGATGCC ATGCTTATCA AATACATGCA CAAGCTAAAC ATAATTTGAA TGGGTCTATG 840 
AAGGAAAAAT AATGCTTAGA CTTTGGTGTA GGTTCTTCCT GTGTAGCCAT ATACCCAGGC 900 
TCTGCAGTAT CGAAGGATGC AAATGTTGAC ATAGATGGAA GCTCTTACCT ACCAAAGTGT 960 
TTAGGAAGGA TAAAGTTACA TTTGTCTTAA TTTCTAACAT TATCTTTGCT TTTATGTTTC 1020 
ATAAAAATTT GTCATTATTT ATGCTGGTGA AACGTATAAT CACATCCAAT TATTTGAACA 1080 
CATGCAAAAT AATTTTTTAA ATTATGTTAT TGTTTAAATT TGACTTATGG GAGATCAGTC 1140 
AAAAACTTAG AAGGTTTAAC ACTTCACTGA TTAATGGTGC TGAAAACACG TTACAATTAC 1200 
CACATATCCT TGCTATAAGT TTTGAAGTTT CTTAGCAATT AAAGTTTTTT TATTCAGTGT 1260 
GAACTGTCAG TATCTATTCT GGTGCTAAAT GTATGGTGCT AAATGAATTG TTAGTGTTGA 1320 
TGGCTTTAGT AATGCTCCTT TTATTCATTG CTAAATTTAG TGTTATCCAT TTGATTCCTG 1380 
ATTCAGAAAT ATCAATAAAA TCCTATGTTA AATTAATCTT TACCAAAAAC AGGCAAGTTA 1440 
ACTCTGTTGT TTTAATTCAA CAGTCCAACA TTATTTAGGT GTTACAGAGT GTAAATATAT 1500 
TTCTTTGGGA GTTATTTTCT TTTTAAAATC TTTTTATAGC TTGGCAATGT CCAAAGTCAA 1560 
ATATCACCTA AACTGGTTAG ATTACTTCTA CAGCTAATAA TATTGCAGGC ACTGGCGCCC 1620 
TCTGGTGGTT ATGAAGACAA ATTCTTAATG GCTACTTGAC CTACAGCAAA AGCCATTTCT 1680 
GTACCATAAA AATTTGTTGT GCAATATTAG AATTATCATA TGTTTCCTAC ATCTGACAGC 1740 
ACCTAAAATG TTTGATAATA TTAACATGTA TCTAAGAGGA AAAAAGAGTT AATATATTCT 1800 
GGCACCCACT TTCCTAGTAA TGTTTTCCAT GATTTTCCAG TTCTGAGGCA CTTATTAAAG 1860 
TGCTTTTTTT TTTCTGAATT AATTAGGTAT TGGTAAAATA TATTTTTAAA TTTAGTTAGC 1920 
TTTATAAACA CAATTAGAAT TACAATTAAT TAACAGAGGT ATAATTGTCT CACTTTCAGA 1980 
AGTGATCATT TATTTTTATT TAGCACAGGT CATAAGAAAA ATATATAGAA AAATAATCAA 2040 
TTTCATATAT AAAAGGATTA TTTCTCCACC TTTAATTATT GGCCTATCAT TTGTTAGTGT 2100 
TATTTGGTCA TATTATTGAA CTAATGTATT ATTCCATTCA AAGTCTTTCT AGATTTAAAA 2160 
ATGTATGCAA AAGCTTAGGA TTATATCATG TGTAACTATT ATAGATAACA TCCTAAACCT 2220 
TCAGTTTAGA TATATAATTG ACTGGGTGTA ATCTCTTTTG TAATCTGTTT TGACAGATTT 2280 
CTTAAATTAT GTTAGCATAA TCAAGGAAGA TTTACCTTGA AGCACTTTCC AAATTGATAC 2340 
TTTCAAACTT ATTTTAAAGC AGTAGAACCT TTTCTATGAA CTAAATCACA TGCAAAACTC 2400 
CAACCTGTAG TATACATAAA ATGGACTTAC TTATTCCTCT CACCTTCTCC AGTGCCTAGG 2460 
AATATTCTTC TCTGAGCCCT AGGATTGATT CTATCACACA GAGCAACATT AATCTAAATG 2520 
GTTTAGCTCC CTCTTTTTTC TCTAAAAACA ATCAGCTAAT AAAAAAAAAA TTTGAGGGCC 2580 
TAAATTATTT CAATGGTTGT TTGAAATATT CAGTTCAGTT TGTACCTGTT AGCAGTCTTT 2640 
CAGTTTGGGG GAGAATTAAA TACTGTGCTA AGCTGGTGCT TGGATACATA TTACAGCATC 2700 
TTGTGTTTTA TTTGACAAAC AGAATTTTGG TGCCATAATA TTTTGAGAAT TAGAGAAGAT 2760 
TGTGATGCAT ATATATAAAC ACTATTTTTA AAAAATATCT AAATATGTCT CACATATTTA 2820 
TATAATCCTC AAATATACTG TACCATTTTA GATATTTTTT AAACAGATTA ATTTGGAGAA 2880 
GTTTTATTCA TTACCTAATT CTGTGGCAAA AATGGTGCCT CTGATGTTGT GATATAGTAT 2940 
TGTCAGTGTG TACATATATA AAACCTGTGT AAACCTCTGT CCTTATGAAC CATAACAAAT 3000 
GTAGCTTTTT AAAGTCCATT GTATTGTTTT TTCTTTCAAT AAAAGAGTAT AATTAATTGG 3060 
TTGTTTTTGA 



BCU7 Protein sequence (SEQ ID NO: 13) 

Gene name: EST; Unigene number: Hs. 98558; Probeset Accession #: AA428062; Protein Accession 
#: n/a; Signal sequence: none; Predicted TM domains: 125-141, 154-170; PFAM domains: none; 
Summary: A type III membrane protein, highly overexpressed in breast cancer and prostate 
cancer; unknown function. 

YFIFQAKANE SLLVSTQPAI FFTACNNGAR IAISYCNCQR QKWSGYKLFH KSSFKLSVLR 60 
FSCGKVSFKK KVIGIHIPHH RSSLWCXFFY MTSRKILIFS QYRFWGFHII KRLKNYNFRI 120 
KLMDFIIELS VSCVDTVLMF LVMTDKFAQK MWMKPLLLLL LLLLFSCLSI IPSVAHHAAE 180 
LPYKFHLAAP 

BFA1 DNA sequence (SEQ ID NO: 14) 

Gene name: calsyntenin-2; Unigene number: Hs.7413; Probeset Accession #: R46025; Nucleic 
Acid Accession #: NM_022131; Coding sequence: 11-2878 (start and stop codons underlined) 

TGCTGCGAGG ATGCTGCCTG GGCGGCTGTG CTGGGTGCCG CTCCTGCTGG CGCTGGGCGT 60 

GGGGAGCGGC AGCGGCGGTG GCGGGGACAG CCGGCAGCGC CGCCTCCTCG CGGCTAAAGT 120 

CAATAAGCAC AAGCCATGGA TCGAGACTTC ATATCATGGA GTCATAACTG AGAACAATGA 180 

CACAGTCATT TTGGACCCAC CACTGGTAGC CCTGGATAAA GATGCACCGG TTCCTTTTGC 24 0 

AGGGGAAATC TGTGCGTTCA AGATCCATGG CCAGGAGCTG CCCTTTGAGG CTGTGGTGCT 300 

CAACAAGACA TCAGGAGAGG GCCGGCTCCG TGCCAAGAGC CCCATTGACT GTGAGTTGCA 36 0 

GAAGGAGTAC ACATTCATCA TCCAGGCCTA TGACTGTGGT GCTGGGCCCC ACGAGACAGC 420 

CTGGAAAAAG TCACACAAGG CCGTGGTCCA TATACAGGTG AAGGATGTCA ACGAGTTTGC 4 80 

TCCCACCTTC AAAGAGCCAG CCTACAAGGC TGTTGTGACG GAGGGCAAGA TCTATGACAG 54 0 

CATTCTGCAG GTGGAGGCCA TTGACGAGGA CTGCTCCCCA CAGTACAGCC AGATCTGCAA 600 

CTATGAAATC GTCACCACAG ATGTGCCTTT TGCCATCGAC AGAAATGGCA ACATCAGGAA 660 

CACTGAGAAG CTGAGCTATG ACAAACAACA CCAGTATGAG ATCCTGGTGA CCGCCTACGA 720 

CTGTGGACAG AAGCCCGCTG CTCAGGACAC CCTGGTGCAG GTGGATGTGA AGCCAGTTTG 780 

CAAGCCTGGC TGGCAAGACT GGACCAAGAG GATTGAGTAC CAGCCTGGCT CCGGGAGCAT 84 0 

GCCCCTGTTC CCCAGCATCC ACCTGGAGAC GTGCGATGGA GCCGTGTCTT CCCTCCAGAT 900 

CGTCACAGAG CTGCAGACTA ATTACATTGG GAAGGGTTGT GACCGGGAGA CCTACTCTGA 960 

GAAATCCCTT CAGAAGTTAT GTGGAGCCTC CTCTGGCATC ATTGACCTCT TGCCATCCCC 102 0 

TAGCGCTGCC ACCAACTGGA CTGCAGGACT GCTGGTGGAC AGCAGTGAGA TGATCTTCAA 1080 

GTTTGACGGC AGGCAGGGTG CCAAAATCCC CGATGGGATT GTGCCCAAGA ACCTGACCGA 114 0 

TCAGTTCACC ATCACCATGT GGATGAAACA CGGCCCCAGC CCTGGTGTGA GAGCCGAGAA 1200 

GGAAACCATC CTCTGCAACT CAGACAAAAC CGAAATGAAC CGGCATCACT ATGCCCTGTA 126 0 

TGTGCACAAC TGCCGCCTCG TCTTTCTCTT GCGGAAGGAC TTCGACCAGG CTGACACCTT 1320 

TCGCCCCGCG GAGTTCCACT GGAAGCTGGA TCAGATTTGT GACAAAGAGT GGCACTACTA 138 0 

TGTCATCAAT GTGGAGTTTC CTGTGGTAAC CTTATACATG GATGGAGCAA CATATGAACC 144 0 

ATACCTGGTG ACCAACGACT GGCCCATTCA TCCATCTCAC ATAGCCATGC AACTCACAGT 1500 

CGGCGCTTGT TGGCAAGGAG GAGAAGTCAC CAAACCACAG TTTGCTCAGT TCTTTCATGG 156 0 

AAGCCTGGCC AGTCTCACCA TCCGCCCTGG CAAAATGGAA AGCCAGAAGG TGATCTCCTG 1620 

CCTGCAGGCC TGCAAGGAAG GGCTGGACAT TAATTCCTTG GAAAGCCTTG GCCAAGGAAT 16 8 0 

AAAGTATCAC TTCAACCCCT CGCAGTCCAT CCTGGTGATG GAAGGTGACG ACATTGGGAA 174 0 

CATTAACCGT GCTCTCCAGA AAGTCTCCTA CATCAACTCC AGGCAGTTCC CAACGGCGGG 1800 

TGTGCGGCGC CTCAAAGTAT CCTCCAAAGT CCAGTGCTTT GGGGAAGACG TATGCATCAG 1860 

TATCCCTGAG GTAGATGCCT ATGTGATGGT CCTCCAGGCC ATCGAGCCCC GGATCACCCT 192 0 

CCGGGGCACA GACCACTTCT GGAGACCTGC TGCCCAGTTT GAAAGTGCCA GGGGAGTGAC 1980 

CCTCTTCCCT GATATCAAGA TTGTGAGCAC CTTCGCCAAA ACCGAAGCCC CCGGGGACGT 204 0 

GAAAACCACA GACCCCAAAT CAGAAGTCTT AGAGGAAATG CTTCATAACT TAGATTTCTG 2100 

TGACATTTTG GTGATCGGAG GGGACTTGGA CCCAAGGCAG GAGTGCTTGG AGCTCAACCA 2160 

CAGTGAGCTC CACCAACGAC ACCTGGATGC CACTAATTCT ACTGCAGGCT ACTCCATCTA 222 0 

CGGTGTGGGC TCCATGAGCC GCTATGAGCA GGTGCTACAT CACATCCGCT ACCGCAACTG 2280 

GCGTCCGGCT TCCCTTGAGG CCCGGCGTTT CCGGATTAAG TGCTCAGAAC TCAATGGGCG 234 0 

CTACACTAGC AATGAGTTCA ACTTGGAGGT CAGCATCCTT CATGAAGACC AAGTCTCAGA 24 00 

TAAGGAGCAT GTCAATCATC TGATTGTGCA GCCTCCCTTC CTCCAGTCTG TCCATCATCC 24 60 

TGAGTCCCGG AGTAGCATCC AGCACAGTTC AGTGGTCCCA AGCATTGCCA CAGTGGTCAT 2520 
CATCATCTCC GTGTGCATGC TTGTGTTTGT CGTGGCCATG GGTGTGTACC GGGTCCGGAT 2580 
CGCCCACCAG CACTTCATCC AGGAGACTGA GGCTGCCAAG GAATCTGAGA TGGACTGGGA 2640 
CGATTCTGCG CTGACTATCA CAGTCAACCC CATGGAGAAA CATGAAGGAC CAGGGCATGG 2700 
GGAAGATGAG ACTGAGGGAG AAGAGGAGGA AGAAGCCGAG GAAGAAATGA GCTCCAGCAG 2760 
TGGCTCTGAC GACAGCGAAG AGGAGGAGGA GGAGGAAGGG ATGGGCAGAG GCAGACATGG 2820 
GCAGAATGGA GCCAGGCAAG CCCAGCTGGA GTGGGATGAC TCCACCCTCC CCTACTAGTG 2880 
CCCAGGGGTC TGCTGCCTGG CCCACATGTC CCTTTTGTAA ACCCTGACCC AGTGTATGCC 2940 
CATGTCTATC ATACCTCACC TCTGATGTCT GTGACATGTC TGGGAAGGCC TTCTCCAGCT 3000 
TCCTGGAGCC CACCCTTTAA GCCTTGGGCA CTCCCTGTGT TTCATCCATG GGGAAGTTCC 3060 
AAGAAGCCCA GCATGGCCAT CAGTGAGGAC TTCAGGGTAG ACTTTGTCCT GTAGCCTCCA 3120 
CTTCTGCCCT AAGTTCCCCA GCATCCTGAC TACCTGTCTG CAGAGTTTGC CTTTGTTTTT 3180 
TCCTGCAGGG AAGAAGGCCC ACCTTTGTGT CACTCACCTC CCCAGGCTCA GAGTCCCCAA 3240 
GGCCCTGGGG TTCCAACTCA CTGTGCGTCT CCTCCACACA GACCAGTAGG TTCTCCTATG 3300 
CTGACTCCAG GTTGCTTCAT ACAAGGAGGG TGGTTGAACT TCACACACGT AAGGTCTTAG 3360 
TGCTTAACAG TTTAAAGGAA AGTCCTTGTT GAGGCAGAAC TAAGTTTACA GGGAAAGGTA 3420 
CACACATTCT CTCTCTCTCT CTCTCTCTGT CTATCTAGTT CCCCAGCTTG GAGAGCCTTT 3480 
CCCCTTGCTT CTTTCTGAGG CCATATAAGC TTATAAGAAA AGTCCCAAAC CAAGAATAGG 3540 
TCCTTGGCCA CAAGCAGGGT CTGATCCCCC ATCAGAGCTA TCTGAGCCTG CCTGTCTGGG 3600 
CACCTGCTGC AACCATGCAG CTACCCTGCC AGGGGCACTC AGCAAACAGA ACCACAGGGC 3660 
CCAGGAGGCA TTCCACACAG GCACTGCCCC AGGACAACAC AACAAGGACA GTCACAACAA 3720 
GGACAACAAG GACACAACAC AACACACAAC AAGGACAGTC ACAACAAGCC TAGAGCCAGA 3780 
AAGCAGATGG AAATGCTAAT GAGGTCAAAC GTAGGCTTCA TGGTGGGTGG AGTGGGGGTG 3840 
GCTGGGCTCC CCCAGGACAG AGGGGACCCT GAGGTTGGCA AGGCTCTCAC CACTCAGCCT 3900 
TATGGTCCCT TATCTCCTAT CTTCCCTCTT GAGAAAATAC ACGCTTTCTG CATGTATTAG 3960 
AAACGCACGA GCTCCACCAA GTCTACAATG AAAGTTTGAA ATTTAACTGC AAGGAATTAG 4020 
AAGCATATTT GCAATCATTG CAGCTTCTTC TTTCTTCTGC TCATAAAAGG AGGAACACTT 4080 
TAGATAGAGG GCAAATATAT CTGAAAACCT AATTTCTTTC TTTTTTTGAT AAGGAAATCT 4140 
TTTCCATCTC CATCCTAACA TGCACAACCT GTGAAGAGAA TTGTTTCTAT AGTAACTGGT 4200 
CTGTGATCTT TTGTGGCCAA GAGAATAGCA GGCAAGAATT AGGGCCTTGA CAGAATTTCC 4260 
ACGAAGCTCT GAGAACATGT TTGTTTCGAA TGTCTGATTC CTCTTTGTCA TCAATGTGTA 4320 
TGCTCTGTCC CCATCCTTCA CTCCTCCTCA AGCTCACACC AATTGGTTTG GCACAGGCAC 4380 
AGAGCTGGTC CCTAGTTAAG TGGCATTTAT GTTAAAAAAA A 



BFA1 Protein sequence (SEQ ID NO: 15) 

Gene name: calsyntenin-2; Unigene number: Hs.7413; Probeset Accession #: R46025; Protein 
Accession #: NP_071414; Predicted Signal sequence: 1-20; Predicted TM domains: 832-848; 
PFAM domains: cadherin_domains: 48-151, 165-254; Summary: A type I membrane protein; a 
member of the calsyntenin family; is related to the FAT tumor suppressor; is likely an 
adhesion molecule important in mammalian developmental processes and cell communication. 



MLPGRLCWVP LLLALGVGSG SGGGGDSRQR RLLAAKVNKH KPWIETSYHG VITENNDTVI 60 
LDPPLVALDK DAPVPFAGEI CAFKIHGQEL PFEAWLNKT SGEGRLRAKS PIDCELQKEY 120 
TFIIQAYDCG AGPHETAWKK SHKAWHIQV KDVNEFAPTF KEPAYKAWT EGKIYDSILQ 180 
VEAIDEDCSP QYSQICNYEI VTTDVPFAID RNGNIRNTEK LSYDKQHQYE ILVTAYDCGQ 240 
KPAAQDTLVQ VDVKPVCKPG WQDWTKRIEY QPGSGSMPLF PSIHLETCDG AVSSLQIVTE 300 
LQTNYIGKGC DRETYSEKSL QKLCGASSGI IDLLPSPSAA TNWTAGLLVD SSEMIFKFDG 360 
RQGAKIPDGI VPKNLTDQFT ITMWMKHGPS PGVRAEKETI LCNSDKTEMN RHHYALYVHN 420 
CRLVFLLRKD FDQADTFRPA EFHWKLDQIC DKEWHYYVIN VEFPWTLYM DGATYEPYLV 480 
TNDWPIHPSH IAMQLTVGAC WQGGEVTKPQ FAQFFHGSLA SLTIRPGKME SQKVISCLQA 540 
CKEGLDINSL ESLGQGIKYH FNPSQSILVM EGDDIGNINR ALQKVSYINS RQFPTAGVRR 600 
LKVSSKVQCF GEDVCISIPE VDAYVMVLQA IEPRITLRGT DHFWRPAAQF ESARGVTLFP 660 
DIKIVSTFAK TEAPGDVKTT DPKSEVLEEM LHNLDFCDIL VIGGDLDPRQ ECLELNHSEL 720 
HQRHLDATNS TAGYSIYGVG SMSRYEQVLH HIRYRNWRPA SLEARRFRIK CSELNGRYTS 780 
NEFNLEVSIL HEDQVSDKEH VNHLIVQPPF LQSVHHPESR SSIQHSSWP SIATWIIIS 840 
VCMLVFWAM GVYRVRIAHQ HFIQETEAAK ESEMDWDDSA LTITVNPMEK HEGPGHGEDE 900 
TEGEEEEEAE EEMSSSSGSD DSEEEEEEEG MGRGRHGQNG ARQAQLEWDD STLPY 



BFG7 DNA sequence (SEQ ID NO: 16) 

Gene name: EST; Unigene number: Hs. 91668; Probeset Accession #: Z40805; Nucleic Acid 
Accession #: n/a; Coding sequence: <1-906 (stop codon underlined) 

CGGGTCGACC CACGCGTCCG GGGAGAAAGG ATGGCCGGCC TGGCGGCGCG GTTGGTCCTG 60 
CTAGCTGGGG CAGCGGCGCT GGCGAGCGGC TCCCAGGGCG ACCGTGAGCC GGTGTACCGC 120 
GACTGCGTAC TGCAGTGCGA AGAGCAGAAC TGCTCTGGGG GCGCTCTGAA TCACTTCCGC 180 
TCCCGCCAGC CAATCTACAT GAGTCTAGCA GGCTGGACCT GTCGGGACGA CTGTAAGTAT 240 
GAGTGTATGT GGGTCACCGT TGGGCTCTAC CTCCAGGAAG GTCACAAAGT GCCTCAGTTC 300 
CATGGCAAGT GGCCCTTCTC CCGGTTCCTG TTCTTTCAAG AGCCGGCATC GGCCGTGGCC 360 
TCGTTTCTCA ATGGCCTGGC CAGCCTGGTG ATGCTCTGCC GCTACCGCAC CTTCGTGCCA 420 
GCCTCCTCCC CCATGTACCA CACCTGTGTG GCCTTCGCCT GGGTGTCCCT CAATGCATGG 480 
TTCTGGTCCA CAGTYTTCCA CACCAGGGAC ACTGACCTCA CAGAGAAAAT GGACTACTTC 540 
TGTGCCTCCA CTGTCATCCT ACACTCAATC TACCTGTGCT GCGTCAGCCT CATCCGCTTC 600 
GACTATGGCT ACAACCTGGT GGCCAACGTG GCTATTGGCC TGGTCAACGT GGTGTGGTGG 660 
CTGGCCTGGT GCCTGTGGAA CCAGCGGCGG CTGCCTCACG TGCGCAAGTG CGTGGTGGTG 720 
GTCTTGCTGC TGCAGGGGCT GTCCCTGCTC GAGCTGCTTG ACTTCCCACC GCTCTTCTGG 780 
GTCCTGGATG CCCATGCCAT CTGGCACATC AGCACCATCC CTGTCCACGT CCTCTTTTTC 840 
AGCTTTCTGG AAGATGACAG CCTGTACCTG CTGAAGGAAT CAGAGGACAA GTTCAAGCTG 900 
GACTGAAGAC CTTGGAGCGA GTCTGCCCCA GTGGGGATCC TGCCCCCGCC CTGCTGGCCT 960 

CCCTTCTCCC CTCAACCCTT GAGATGATTT TCTCTTTTCA ACTTCTTGAA CTTGGACATG 1020 

AAGGATGTGG GCCCAGAATC ATGTGGCCAG CCCACCCCCT GTTGGCCCTC ACCAGCCTTG 1080 

GAGTCTGTTC TAGGGAAGGC CTCCCAGCAT CTGGGACTCG AGAGTGGGCA GCCCCTCTAC 114 0 

CTCCTGGAGC TGAACTGGGG TGGAACTGAG TGTGCTCTTA GCTCTACCGG GAGGACAGCT 1200 

GCCTGTTTCC TCCCCATCAG CCTCCTCCCC ACATCCCCAG CTGCCTGGCT GGGTCCTGAA 126 0 

GCCCTCTGTC TACCTGGGAG ACCAGGGACC ACAGGCCTTA GGGATACAGG GGGTCCCCTT 13 20 

CTGTTACCAC CCCCCACCCT CCTCCAGGAC ACCACTAGGT GGTGCTGGAT GCTTGTTCTT 1380 

TGGCCAGCCA AGGTTCACGG CGATTCTCCC CATGGGATCT TGAGGGACCA AGCTGCTGGG 1440 

ATTGGGAAGG AGTTTCACCC TGACCRTTGC CCTAGCCAGG TTCCCAGGAG GCCTCACCAT 1500 

ACTCCCTTTC AGGGCCAGGG CTCCAGCAAG CCCAGGGCAA GGATCCTGTG CTGCTGTCTG 156 0 

GTTGAGAGCC TGCCACCGTG TGTCGGGAGT GTGGGCCAGG CTGAGTGCAT AGGTGACAGG 1620 

GCCGTGAGCA TGGGCCTGGG TGTGTGTGAG CTCAGGCACT AGGTGCGCAG TGTGGAGACG 1680 

GGTGTTGTCG GGGAAGAGGT GTGGCTTCAA AGTGTTGTGT GTGCAGGGGG TKGGTGTGTT 174 0 

AAGCGTGGGT TAGGGGAACG TGTGTGCGCG TGCTGGTGGG CATGTGAGAT GAGTGACTGC 1800 

CGGTGAATGT GTCCACAGTT GAGAGGTTGG AGCAGGATGA GGGAATCCTG TCACCATCAA 186 0 

TAATCACTTG TGGAGCGCCA CTTGGCCCAA GACGCCACCT GGGCGGACAG CAGGAGCTCT 1920 

CCATGGCCAG GCTGCCTGTG TGCATGTTCC CTGTCTGGTG CCCCTTTGCC CGCCTCCTGC 198 0 

AAACCTCACA GGGTCCCCAC ACAACAGTGC CCTCCAGAAG CAGCCCCTCG GAGGCAGAGG 2 04 0 

AAGGAAAATG GGGATGGCTG GGGCTCTCTC CATCCTCCTT TTCTCCTTGC CTTCGCATGG 2100 

CTGGCCTTCC CCTCCAAAAC CTCCATTCCC CTGCTGCCAG CCCCTTTGCC ATAGCCTGAT 216 0 

TTTGGGGAGG AGGAAGGGGC GATTTGAGGG AGAAGGGGAG AAAGCTTATG GCTGGGTCTG 222 0 

GTTTCTTCCC TTCCCAGAGG GTCTTACTGT TCCAGGGTGG CCCCAGGGCA GGCAGGGGCC 22 80 

ACACTATGCC TGCGCCCTGG TAAAGGTGAC CCCTGCCATT TACCAGCAGC CCTGGCATGT 234 0 

TCCTGCCCCA CAGGAATAGA ATGGAGGGAG CTCCAGAAAC TTTCCATCCC AAAGGCAGTC 24 00 

TCCGTGGTTG AAGCAGACTG GATTTTTGCT CTGCCCCTGA CCCCTTGTCC CTCTTTGAGG 246 0 

GAGGGGAGCT ATGCTAGGAC TCCAACCTCA GGGACTCGGG TGGCCTGCGC TAGCTTCTTT 2520 

TGATACTGAA AACTTTTAAG GTGGGAGGGT GGCAAGGGAT GTGCTTAATA AATCAATTCC 2580 
AAGCCTCAAA AAAAAAAAAA AAAAAAAAAA AAAAAA 



BFG7 Protein sequence (SEQ ID NO: 17) 

Gene name: EST; Unigene number: Hs. 91668; Probeset Accession #: Z40805; Protein Accession 
#: n/a; Signal sequence: none; Predicted TM domains: 117-133, 179-195, 211-227, 235-251, 
266-282, 296-312; PFAM domains: none; Summary: A type III membrane protein of unknown 
function; is adjacent to HER2 on the genome, and its overexpression in breast cancer is highly 
correlated with HER2 expression; may be used to predict HER2 overexpression and amplification. 

RVDPRVRGER MAGLAARLVL LAGAAALASG SQGDREPVYR DCVLQCEEQN CSGGALNHFR 60 
SRQPIYMSLA GWTCRDDCKY ECMWVTVGLY LQEGHKVPQF HGKWPFSRFL FFQEPASAVA 120 
SFLNGLASLV MLCRYRTFVP ASSPMYHTCV AFAWVSLNAW FWSTVFHTRD TDLTEKMDYF 180 
CASTVILHSI YLCCVRTVGL QHPAWSAFR ALLLLMLTVH VSYLSLIRFD YGYNLVANVA 240 
IGLVNWWWL AWCLWNQRRL PHVRKCWW LLLQGLSLLE LLDFPPLFWV LDAHAIWHIS 300 
TIPVHVLFFS FLEDDSLYLL KESEDKFKLD 



BCN4 DNA sequence (SEQ ID NO: 18) 

Gene name: ESTs; Unigene number: Hs. 283713; Probeset Accession #: F13673; Nucleic Acid 
Accession #: n/a; Coding sequence: 143-874 (start and stop codons underlined) 

GGGAGGGAGA GAGGCGCGCG GGTGAAAGGC GCATTGATGC AGCCTGCGGC GGCCTCGGAG 60 

CGCGGCGGAG CCAGACGCTG ACCACGTTCC TCTCCTCGGT CTCCTCCGCC TCCAGCTCCG 12 0 

CGCTGCCCGG CAGCCGGGAG CCATGCGACC CCAGGGCCCC GCCGCCTCCC CGCAGCGGCT 180 

CCGCGGCCTC CTGCTGCTCC TGCTGCTGCA GCTGCCCGCG CCGTCGAGCG CCTCTGAGAT 24 0 

CCCCAAGGGG AAGCAAAAGG CGCAGCTCCG GCAGAGGGAG GTGGTGGACC TGTATAATGG 3 00 

AATGTGCTTA CAAGGGCCAG CAGGAGTGCC TGGTCGAGAC GGGAGCCCTG GGGCCAATGG 3 60 

CATTCCGGGT ACACCTGGGA TCCCAGGTCG GGATGGATTC AAAGGAGAAA AGGGGGAATG 420 

TCTGAGGGAA AGCTTTGAGG AGTCCTGGAC ACCCAACTAC AAGCAGTGTT CATGGAGTTC 480 

ATTGAATTAT GGCATAGATC TTGGGAAAAT TGCGGAGTGT ACATTTACAA AGATGCGTTC 54 0 

AAATAGTGCT CTAAGAGTTT TGTTCAGTGG CTCACTTCGG CTAAAATGCA GAAATGCATG 600 

CTGTCAGCGT TGGTATTTCA CATTCAATGG AGCTGAATGT TCAGGACCTC TTCCCATTGA 66 0 

AGCTATAATT TATTTGGACC AAGGAAGCCC TGAAATGAAT TCAACAATTA ATATTCATCG 720 

CACTTCTTCT GTGGAAGGAC TTTGTGAAGG AATTGGTGCT GGATTAGTGG ATGTTGCTAT 78 0 

CTGGGTTGGC ACTTGTTCAG ATTACCCAAA AGGAGATGCT TCTACTGGAT GGAATTCAGT 840 

TTCTCGCATC ATTATTGAAG AACTACCAAA ATAAATGCTT TAATTTTCAT TTGCTACCTC 90 0 

TTTTTTTATT ATGCCTTGGA ATGGTTCACT TAAATGACAT TTTAAATAAG TTTATGTATA 96 0 

CATCTGAATG AAAAGCAAAG CTAAATATGT TTACAGACCA AAGTGTGATT TCACACTGTT 102 0 

TTTAAATCTA GCATTATTCA TTTTGCTTCA ATCAAAAGTG GTTTCAATAT TTTTTTTAGT 108 0 

TGGTTAGAAT ACTTTCTTCA TAGTCACATT CTCTCAACCT ATAATTTGGA ATATTGTTGT 114 0 

GGTCTTTTGT TTTTTCTCTT AGTATAGCAT TTTTAAAAAA ATATAAAAGC TACCAATCTT 1200 

TGTACAATTT GTAAATGTTA AGAATTTTTT TTATATCTGT TAAATAAAAA TTATTTCCAA 1260 
CAACCTTAAA AAAAAAAAAA AAAA 
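
The "Coding sequence: 143-874" annotation for BCN4 gives 1-based, inclusive nucleotide positions. The following is a minimal sketch, not part of the original listing, of how such a range can be checked against the printed rows. The helper names (strip_layout, extract_region, translate) are hypothetical and introduced only for illustration; the sketch assumes the standard genetic code and uses only the listing row that covers positions 121-180.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def strip_layout(listing: str) -> str:
    """Drop spaces, line breaks and position counters from a printed listing."""
    return "".join(ch for ch in listing.upper() if ch.isalpha())


def extract_region(seq: str, start: int, end: int, offset: int = 1) -> str:
    """Return positions start..end (1-based, inclusive).

    offset is the listing position of seq[0], so a single row can be used
    without loading the whole sequence.
    """
    return seq[start - offset : end - offset + 1]


def translate(cds: str) -> str:
    """Translate complete codons; codons with ambiguity codes become X."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE.get(cds[i : i + 3], "X") for i in range(0, usable, 3))


# Illustrative input: the BCN4 row covering positions 121-180.
row = "CGCTGCCCGG CAGCCGGGAG CCATGCGACC CCAGGGCCCC GCCGCCTCCC CGCAGCGGCT 180"
fragment = strip_layout(row)

start_codon = extract_region(fragment, 143, 145, offset=121)
first_residues = translate(extract_region(fragment, 143, 178, offset=121))
print(start_codon, first_residues)
```

Run on this row, the sketch prints ATG MRPQGPAASPQR, which agrees with the first twelve residues of the BCN4 protein sequence (SEQ ID NO:19). The annotated range spans 874 - 143 + 1 = 732 nucleotides, i.e. 244 codons, consistent with a 243-residue protein followed by a stop codon.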

BCN4 Protein sequence (SEQ ID NO:19) 

Gene name: ESTs; Unigene number: Hs. 283713; Probeset Accession #: F13673; Protein Accession 
#: n/a; Predicted Signal sequence: 1-30; TM domains: none; PFAM domains: none; Summary: a 
secreted protein; has a mouse orthologue (see sequence below). 






MRPQGPAASP QRLRGLLLLL LLQLPAPSSA SEIPKGKQKA QLRQREWDL YNGMCLQGPA 60 

GVPGRDGSPG ANGIPGTPGI PGRDGFKGEK GECLRESFEE SWTPNYKQCS WSSLNYGIDL 120 

GKIAECTFTK MRSNSALRVL FSGSLRLKCR NACCQRWYFT FNGAECSGPL PIEAIIYLDQ 180 

GSPEMNSTIN IHRTSSVEGL CEGIGAGLVD VAIWVGTCSD YPKGDASTGW NSVSRIIIEE 240 
LPK 

Mouse BCN4 Protein sequence (SEQ ID NO: 20) 
Gene name: ESTs; Unigene number: Mm. 41556 

XXXXAAPPQL LLGLFLVLLL LLQLSAPSSA SENPKVKQKA LIRQREWDL YNGMCLQGPA 60 

GVPGRDGSPG ANGIPGTPGI PCQDGFKGEK GECLRESFEE SWTPNYKQCS WSSLNYGIDL 120 

GKIAECTFTK MRSNSALRVL FSGSLRLKCR NACCQRWYFT FNGAECSGPP PIEAIXXXXX 180 

XXXXXXXXXX XXXXXXXXXX XXXXXXXXXX XXXXXXXXSD YPKGDAYTGW DSVSRIIIEE 240 
LPK 